Tuesday

The Cost Disease, Part 2

In an earlier post, I reviewed "The Cost Disease: Why Computers Get Cheaper and Health Care Doesn't" by William J. Baumol and others. I highly recommend the book, for reasons I set out in the review.

Those reasons mostly did not have to do with health care. So I wanted to point out a few of the suggestions Baumol makes in the chapter "Yes, We Can Cut Health Care Costs Even If We Cannot Control Their Growth Rate." The cost disease, as Baumol and company define it, relates to the rate of growth of costs. But we can still limit some costs - which will reduce the cost level. How do we do so? Here are Baumol's suggestions:
  1. Use statistical methods to improve the evaluation of medical treatments (Baumol offers several cautions, including ones I've discussed before - be aware of sampling errors, don't confuse correlation and causation).
  2. Avoid harmful or unnecessary treatments and procedures - he cites the rising C-section rate as one example.
  3. Increase the use of genetic information to guide medication and treatment.
  4. Identify less expensive treatments, new and old.
  5. Practice preventive medicine.
  6. Make lifestyle changes - get more exercise, eat less fat.
  7. Reform the medical liability system.
  8. Make changes in medical education and in health insurance practices.

Friday

Google's Tubes in pictures.


The Internet may be a series of tubes - along with the wires and pipes that hold them. They are big. And, at the moment, colorful, as you can see in this series of pictures of Google's data centers from Forbes Magazine. Here's one more, of cooling pipes:

 

The series as a whole is spectacular. Take a look.


Thursday

Is wine tasting reliable and consistent? Study says no

Think you can tell good wine from less good wine in a blind tasting? Think again. Robert Hodgson, a professor turned vintner, has published a study analyzing the performance of expert judges in the California State Fair wine competition for the years 2005-2008. His conclusion? In only about half the cases was the wine, and the wine alone, the deciding factor.

How could he tell? Judges try wine in flights of about 30 wines each. The researchers included three different pours of four wines in one of the flights, so each judge tried four wines three times. The wine was poured from the same bottle each time. You can read the full article here. Interestingly, the article suggests that judges were more consistent at judging wine they thought was of very low quality.
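The kind of consistency check Hodgson ran can be sketched in a few lines. The wine names, scores, and medal-boundary comment below are invented for illustration; only the study design (four wines, each poured three times) comes from the article:

```python
# Sketch: how consistently does one judge score identical wines?
# Hodgson's design: four wines, each poured three times in the same flight.
# All names and scores below are made up for illustration.

def score_spread(scores):
    """Range of scores a judge gave to repeated pours of the same wine."""
    return max(scores) - min(scores)

# One hypothetical judge's scores (100-point scale) for triplicate pours.
judge_scores = {
    "wine_A": [90, 86, 92],
    "wine_B": [84, 84, 85],
    "wine_C": [95, 88, 81],
    "wine_D": [72, 73, 71],
}

for wine, scores in judge_scores.items():
    # A spread of a few points can cross a medal boundary
    # (e.g., gold vs. silver), which is Hodgson's point.
    print(f"{wine}: scores={scores}, spread={score_spread(scores)}")
```

A judge whose spreads stay near zero is consistent; the study found most judges' spreads were wide enough to swing medal outcomes.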

But wait, there's more. Hodgson was able to compare judge performance from year to year. According to this article in the Guardian:
"The results are disturbing," says Hodgson from the Fieldbrook Winery in Humboldt County, described by its owner as a rural paradise. "Only about 10% of judges are consistent and those judges who were consistent one year were ordinary the next year."
Wine is complex, and a lot goes into tasting it, including the wine's temperature and what the taster ate earlier that day. So if you pick wine by the medals it has won, well, maybe you'll like it but maybe you won't.

According to the Guardian there does appear to be a scientific basis for the practice of drinking white wines while eating fish.
Researchers from Japanese drinks firm Mercian tested 64 varieties of wine with scallops, and concluded that the iron content of red wine speeded up the decay of fish, resulting in an overly ‘fishy’ taste.
How do you pick wine?

Thanks to Eli Molin for the article in the Guardian. Image via Clown Fish Wines.

Monday

Last week, Eduardo Porter of the New York Times wrote this column about how hard it is to become - or remain - middle class in this country. The article is illustrated by the graphs in the screenshot. One statistic Porter calls staggering is that "the typical household made $51,017, roughly the same as the typical household made a quarter of a century ago." Sure, according to the graph, the median household income had ticked up above that during the late 1990s and early 2000s, but the trend has been down again since the 2008 financial crisis. Equally shocking, we still have approximately 15% of our population living below the poverty line, a number that has been increasing over the last five years.

The surface explanation is also in those charts: the richest quintile has increased its share of total income to 49.9 percent, up from 42.1 percent in 1967. According to this article, in terms of net worth, the top one percent owns 34.6% of the wealth, and the next 19% owns 50.5%. The bottom 80% owns 15% of the wealth. That makes the middle mighty small.

And that fact has had some repercussions. In 2012 Porter gathered some statistics about social well-being here.
It is not just that income inequality is the most acute of any industrialized country. More American children die before reaching age 19 than in any other rich country in the O.E.C.D. More live in poverty. Many more are obese. When they reach their teenage years, American girls are much more likely to become pregnant and have babies than teenagers anywhere else in the industrial world.
We understand the importance of early childhood development. Yet our public spending on early childhood is the most meager among advanced nations. We value education. Yet our rate of enrolling 3- to 5-year-olds in preschool programs is among the lowest among advanced nations. Our 15-year-olds place 26th out of 38 countries on international tests of mathematical literacy, according to the O.E.C.D. The first nation to understand the value of widespread college education, the United States has dropped from the top to the middle of the pack of our economically advanced peers in terms of college graduation rates.  
Porter has also gathered statistics about economic inequality here. We pay lower taxes than other industrialized nations, and we seem not to mind giving up government services as a result. 
The big exception has been the United States. In 1965, taxes collected by federal, state and municipal governments amounted to 24.7 percent of the nation’s output. In 2010, they amounted to 24.8 percent. Excluding Chile and Mexico, the United States raises less tax revenue, as a share of the economy, than every other industrial country.

Biblical Floods in Colorado

You've probably been hearing about the epic - Biblical, thousand-year - floods that Boulder, Colorado is experiencing. The cause is record rainfall - as you can see from the chart above, developed by Climate Central. In fact, according to Weather Underground and Climate Central, Boulder, which normally gets 1.7 inches of rain in September and 20.68 inches for the year, got half a year's rain in less than half the month of September. (The forecast has a small chance of rain today, and then sunshine for the next few days.)

What might be causing all the rain? The Pacific. According to Climate Central:

During the past couple of weeks, the weather across the West has featured both an active Southwest Monsoon and a broad area of low pressure at upper levels of the atmosphere, which has been pinned by other weather systems and prevented from moving out of the region. It was this persistent low pressure area that helped pull the moisture out of the tropics and into Colorado. Signs point to the tropical Pacific being the source of the abundant moisture according to the University of Wisconsin’s Cooperative Institute for Meteorological Satellite Studies. From there, the moisture plume was transported northeastward, over Mexico and into Texas, and then northward by upper level winds.

This tropical air mass, which is more typical of the Gulf Coast than the Rocky Mountains, has been forced to move slowly up and over the Front Range by light southeasterly winds. This lifting process, known as orographic lift, allowed the atmosphere to wring out this unusually bountiful stream of moist air, dumping torrents of rain on the Boulder area for days on end.
That's a screenshot of the satellite image loop CIMSS released showing the tropical air mass. (I couldn't find a version to embed, but click on the link to Climate Central - you can see it moving there.)

Is climate change involved? No one weather event can be traced back easily to climate change, but there is at least one suggestive factor: the magnitude of the change from past events. And, of course, temperatures are rising around the globe. Generally, warmer temperatures mean more water vapor in the air, which means more extreme rain or snowfall. Stay tuned.



Tuesday

Measurement of intangibles at Harvard Business School


I read this article, published in the New York Times on September 7, about the administration's efforts to ensure gender equity at Harvard Business School with a great deal of interest - partly for the detailed glimpse into another world, and partly for some of the great comments people made.
But during [graduation] week’s festivities, the Class Day speaker, a standout female student, alluded to “the frustrations of a group of people who feel ignored.” Others grumbled that another speechmaker, a former chief executive of a company in steep decline, was invited only because she was a woman. At a reception, a male student in tennis whites blurted out, as his friends laughed, that much of what had occurred at the school had “been a painful experience.”
The article is relevant here because of the focus of the administrators on measurement. Here's what they did:

* Turned what was subjective into an objective measure: Women in the B School lagged in class participation - but participation grades are both subjective and dependent on memory. The business school administrators posted stenographers in every class so faculty no longer had to rely on their memories.

* Provided information quickly: As the reporter describes it, "[n]ew grading software tools let professors instantly check their calling and marking patterns by gender." Information was used to identify and change behavior, not to punish. In fact, it appears that the administrators pushed responsibility for change down to the front lines, making it a team effort. One professor stated that the message he got from the administration was: “We’re going to solve it at the school level, but each of you is responsible to identify what you are doing that gets you to this point.” Management trusted workers to get the message, and to change.

* Developed a theory and tested it: the article reports that an additional factor, one the school really couldn't control, was contributing to the women's minimal participation. Social success was as important as academic success, possibly more important, and class participation could hurt social capital.

As you have probably surmised, all this was expensive and time-consuming. At least so far, the B School administrators haven't been able to try other approaches. And it's too soon to tell whether salaries for men and women will be comparable 10 years or more after graduation. The atmosphere, as reported, sounds a lot better, but, as the article points out, the experiment brought with it unintended consequences and helped other issues - like class differences - surface. The story's not done yet, but the tale so far makes for fascinating reading, and so do the comments. (The graphics online are much better than they were in the physical newspaper.)

Update: This post has been updated to correct a typo.

Wednesday

Data Mining and You

You may have read the article in Sunday's New York Times or other news coverage about the efforts of Acxiom, a data mining company, to make an individual's data available to him or her. The site went live today in beta - that's a screenshot of the opening page - and I checked it out.

You have to provide some information about yourself: address, birth date, and the last four digits of your Social Security number. That's so Acxiom can authenticate that you are really you. When you do so, you get data in several categories, what Acxiom calls core data (address, phone number, age) and derived insights (inferences derived from your core data - e.g., whether you like cooking).

So what happens when you explore the data? It's kind of interesting. When I looked myself up, it had the basics of age and address right. There's not a lot of data about our housing, but our housing type means the records are corporate, not individual, so it makes sense that there wouldn't be much information. Our car is in my name, so finding nothing about the car was a bit more of a surprise. The records indicated the presence of one of our children, with an incorrect age. There was some incomplete information about my online shopping habits - a bit skewed by the fact that we bought a lot of bed linens online when one of the kids left for college. Two years ago.

Things got more interesting when it came to my spouse. He has been conflated with someone with the same name and a similar birthday who is 20 years older. That person bought an expensive condo in 2007, is of a different religion, owns a car and plays golf. So what's our conclusion? No one here is going to lose any sleep over consumer data mining.

The Aboutthedata site allows you to correct the information. There seems to be little downside in doing so, but also little need to. The company argues that more accurate information means that you'll get more relevant offers in all those annoying little side ads that appear when you do web searches (or go on Facebook). I try to ignore them but sometimes find them entertaining. You also have an opt-out option, but that won't give you any fewer ads, the site explains. It will just make them less relevant.

What do you think?

Tuesday

That picture is moss growing on what we once thought of as a cold, snow-covered continent: Antarctica. I wrote a post earlier this summer about plants that had come in from the cold, and that seemed kind of exciting. Almost sweet, in a way, that plants that had lived under the ice for so long could still bloom. But, as grist.org points out, this patch of moss is yet another signal of long-term climate change. Scientists report in the underlying article, available here, that "growth rates and microbial productivity have risen rapidly since the 1960s, consistent with temperature changes. . . " though growth seems to have tapered off in the most recent years. (They don't say why, but don't assume it means that the global climate has finished changing.)


Image via grist.org

Wednesday

Yosemite fire in images

That spectacular but frightening photograph is of the Yosemite fire, one of several that TheAtlantic.com published yesterday, available here.

And from Grist.org, here is what they call a list of 9 scary facts about that fire. If you have time, watch the video they've embedded - taken from a plane dumping retardant at the edge of the fire, it's got some amazing views.

Amazon deforestation - as seen from space



There's a series of GIFs available, built from the Landsat satellite images in Google Earth - that's the deforestation of the Brazilian Amazon, above. The picture quality isn't great, but the degree of change is pretty clear. You can also see a series showing urban growth around the world here.

Friday

Code for America's useful new startups


I've embedded the Code for America webinar introducing the first three startups its incubator has produced. It's an interesting hour in which three developers describe their software: all three applications provide tools that engage citizens with local government, not-for-profits, and community groups. I can think of lots of uses for each of them. They are:

Localdata - Localdata provides tools, both electronic and paper-and-pencil, that allow local organizers to collect data on anything: how many trees need trimming? Where are the commercial corridors in a neighborhood? Once data are collected, the app allows you to analyze and map the data easily.

Textizen - Textizen provides a text-message survey platform: each survey gets its own phone number, and residents text in their thoughts. And each response can be turned into a conversation, engaging the resident more deeply.

CivicInsight - CivicInsight brings together all government data about a community's empty spaces in one place, in a way that is easy to understand. You can try it out for New Orleans.

All three apps provide quick analytics.

You can see my last post about Code for America here.

Thursday

Mark Edmundson on teaching - and learning

If you haven't yet read about it, Mark Edmundson's new book "Why Teach?" sounds quite interesting - for a start, to anyone with kids in high school and college, but also for people with an interest in the future of education (not to mention the costs of college). There's an excerpt from the book on TheAtlantic.com titled "'Where Should I Go to College?'" in which Edmundson distinguishes universities that are more scholarly enclaves from those he calls "corporate cities." (He acknowledges that neither exists "in its pure form.") Here's how Edmundson outlines high school preparation for either kind of college:

High school now is about being an all-arounder. You've got to be good at your classes, but you've also got to shine as a citizen and a general hand-waving, high-enthusiasm participant. To do this, you've often got to make yourself into a superb time manager. You give each activity the amount of time and effort required so that you can reach the so-called standard of excellence. You give it that much, but you give it no more. Do I really need to read the whole book to get an A in English, the student asks herself? Probably she doesn't. Do I need a tutor and extra time to score a top grade in math? Perhaps yes. If so, the money is well spent and so is the time. Will it look better to put in two hours a week volunteering at the hospital or four at the soup kitchen? Does the guidance counselor say that both will look about the same to the college admissions board? Then better to do the hospital: You'll need those extra two hours for prom committee.
The article is worth reading in its entirety. You can also read a review of the book in the NY Times here. I hope you - and your kids - find the great teachers out there.

Monday

Peter Singer has an interesting Op-Ed, "Good Charity, Bad Charity," in yesterday's New York Times, in which he argues that there are clear answers to the question 'to which charity should I donate?' In his view there is a stark choice between donating to organizations that provide medical and social services and donating to cultural organizations like museums.

I tend to agree, though I question his assumption that donating to a cause overseas is more important than donating to an organization that serves people in the US. (I can also see an argument that funding a museum is important - future artists need a place to go and view art, and without public museums most art would be in private collections. The rest of us sometimes need to see art too.)

But what's most interesting about Singer's piece, and the reason I'm linking to it here, is his final point: there is now objective evidence of a charity's effectiveness available, and donors can use it as part of their decision-making. Singer links to both the sites of GiveWell and GiveDirectly. The former ranks charities by their effectiveness. It's still fairly small, and focuses on finding outstanding charities rather than ranking all (or many) charities, but to its credit GiveWell is open about its processes and its mistakes.

GiveDirectly, which is highly rated by GiveWell, transfers donations directly (and electronically) to recipients' cell phones. Recipients then use the money for whatever is important to them. GiveDirectly reports:
  • The most frequent self-reported use of funds is purchasing a metal roof. We estimate the annual rate of return on metal as opposed to thatch roofing to be 15%-20%, suggesting this is an attractive investment.
  • 1% of recipients report regrets about the way they used their transfer. For example, one woman chose not to pursue a business opportunity but later wished that she had.
  • 1% of recipients report having had some of their transfer stolen.
  • On net, 100% report being better-off as a result of the transfer.
Helping organizations understand the impact of the work they perform is one of the most important things I do, so it's heartening to see the progress. Organizations don't have to wait for a GiveWell to tell them how they're doing - with a little effort, they can do it themselves. It's well worth the investment.

Thursday

Global Temperatures by Decade

I can't resist sharing this terrific graph that came via Grist.org. The dotted gray line is the long-term average for the period 1961-1990, so if the last two decades were included it would be much higher.

Grist quotes from the associated report:
The rapid changes that have occurred since the middle of the past century, however, have been caused largely by humanity’s emissions of greenhouse gases into the atmosphere. Other human activities also affect the climate system, including emissions of pollutants and other aerosols, and changes to the land surface, such as urbanization and deforestation.

Tuesday

A summer respite

Blogging will be spotty over the next few months as I will be doing some traveling - see you occasionally, and definitely in September.

Friday

Bloomberg's Billionaires

 
Data visualization can serve two purposes and two audiences, says Lisa Strausfeld, Global Head of Data Visualization at Bloomberg LP. For the novice, it can serve as an explanation; for the expert, it can guide exploration. Bloomberg Billionaires - that's a screenshot above - does a bit of both. It's interactive: the screenshot shows billionaires plotted by industry. But you can also change the day, and filter sets of data. For example, the screenshot below is the net worth rankings as of March 14, 2012:

Those handy little pop-up flags link to recent stories.

You can also filter by industry, citizenship, gender, and source of wealth (inherited or self-made).  Interestingly, Bloomberg himself does not appear on the list.


Thursday

A satellite's view of the Earth

This spectacular image comes from Infinity Imagined - and shows a day's worth of weather. Photos are from the GOES-14 weather satellite, taken last week, on May 22nd. The individual pictures are here. Take a look at the full blog for more spectacular images.

Tuesday

Retreat of glaciers uncovers old plants

One more effect of climate change is becoming evident: plants that were frozen under ice for centuries are reviving and growing. Biologists from the University of Alberta have discovered bryophytes that last grew before the Little Ice Age (1550-1850). They are newly uncovered, and appear to be growing in the wild. They grew in the lab, too. The abstract and article are here. A news report from the BBC is here. Here's what Catherine La Farge, the lead biologist, had to say:

"We ended up walking along the edge of the glacier margin and we saw these huge populations coming out from underneath the glacier that seemed to have a greenish tint," said Catherine La Farge, lead author of the study.
. . .
"When we looked at them in detail and brought them to the lab, I could see some of the stems actually had new growth of green lateral branches, and that said to me that these guys are regenerating in the field, and that blew my mind," she told BBC News.
"If you think of ice sheets covering the landscape, we've always thought that plants have to come in from refugia around the margins of an ice system, never considering land plants as coming out from underneath a glacier."

Friday

Review of "Big Data"

I had mentioned in a post that I was looking forward to reading "Big Data: A Revolution That Will Transform How We Live, Work, and Think" by Viktor Mayer-Schonberger and Kenneth Cukier. Mayer-Schonberger and Cukier live (or at least write) by the law of threes, and they have three good points to make about the big data culture whose development we are witnessing:
  1. We will have so much more data that we won't need to sample;
  2. More data means that we won't need to worry so much about exactitude; and
  3. We will make decisions based on correlation, not causality.
Naturally, each of these ideas, when it gets a chapter of its own, gets developed a little further. (Each chapter has a clever one-word title.) In the chapter "More" the authors point out that, while a truly random sample can be quite small, it can be difficult to obtain one. Systematic biases, for example, often taint the collection process. Further, if you are analyzing a small, random sample, you often do not have enough data points to drill down further into the data. Modern computing power and the huge amount of data now available mean that analysts don't have to limit themselves to samples. Much bigger datasets allow us to "spot connections and details that are otherwise cloaked in the vastness of the information." So, they conclude, with data, bigger really is better.
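The drill-down point can be made concrete with a toy simulation (all numbers here are invented): a modest random sample estimates the overall average well, but leaves too few observations in a rare subgroup to say anything reliable about it.

```python
import random

random.seed(0)

# Toy population: 100,000 records, 1% belong to a rare subgroup.
# Each record is (group, measurement); measurements average around 50.
population = [
    ("rare" if random.random() < 0.01 else "common", random.gauss(50, 10))
    for _ in range(100_000)
]

# A "traditional" small random sample.
sample = random.sample(population, 1000)

overall_mean = sum(v for _, v in sample) / len(sample)
rare_in_sample = [v for g, v in sample if g == "rare"]

print(f"overall mean from sample: {overall_mean:.1f}")
print(f"rare-subgroup observations in sample: {len(rare_in_sample)}")
# Roughly 10 rare records survive the sampling: plenty for the
# overall mean, far too few to analyze the subgroup on its own.
```

With the full dataset, the rare subgroup has about 1,000 records, which is the authors' argument for keeping everything.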

Large datasets, they go on in a chapter entitled "Messy," will have several types of errors: some measurements will be wrong; combining different datasets that don't always match up exactly will give approximations, rather than exact numbers. But the tradeoff, say the authors, is worth it. They provide as an example language translation programs - simple programs and more data are better at accurate translation than complex models with less data. They are careful to add that the results are not exact. "Big data transforms figures into something more probabilistic than precise."

The chapter "Correlation" explains why it's not so important to know "why" when you can know, through correlations, "what" happens, or, to put it more precisely, what is more likely to happen. As the authors put it, with correlations, "there is no certainty, only probability." As a result, we need to be very chary of coincidence. (We often think we see causality when in fact we have observed correlation. Or coincidence.) They add that correlations can point the way to test for causal relationships.
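A toy simulation (invented numbers, classic textbook example) shows how two variables can correlate strongly without either causing the other - both are driven by a hidden third factor:

```python
import random

random.seed(1)

# Hidden common cause: daily temperature.
temps = [random.uniform(0, 35) for _ in range(10_000)]

# Ice-cream sales and sunburn counts both rise with temperature,
# plus independent noise; neither causes the other.
ice_cream = [2.0 * t + random.gauss(0, 5) for t in temps]
sunburns = [0.5 * t + random.gauss(0, 2) for t in temps]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Strong correlation - but banning ice cream would not prevent sunburn.
print(f"correlation: {pearson(ice_cream, sunburns):.2f}")
```

The correlation is a perfectly good predictor (the "what"), which is the authors' claim; it just tells you nothing about the "why" without further testing.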

So far, so good. The authors go on to chapters about the turning of information into data, and the creation or capture of value. The book is written in a breezy, accessible style; it never mentions the term "Bayesian," for example, although that is clearly what the authors are talking about. But towards the end the energy peters out, and the final chapters feel like filler. The chapter "Risks," which raises some entirely speculative concerns - that we might be punished simply for our "propensity" to behave in a certain way, for example - feels rushed and empty. Its over-simplification of the US criminal justice system made me wonder what else might have been altered beyond recognition. So read the first part of the book for its useful outline of what big data entails, but go elsewhere for a more serious discussion of the policy implications.
Image via Amazon.com

Wednesday

Sixty years of tornadoes


That screenshot? It's the tracks of all the tornadoes over the last 60 years, at least those that caused enough damage to be recorded. It's intimidating, but a little misleading, since it shows all of them at once, going back to 1951. You can see a video of the tracks by year below, produced by IDV Solutions. Each trail (or dot) is an individual tornado; the fiercer the winds, the brighter the trail.

 

You can also find a longer video with the tracks by month here.

NOAA answers some basic questions about tornadoes, here.

Tuesday

Interactive wildfire map



Thanks to our friends at Climate Central for this handy map showing active wildfires in the US. If you click in, you can get the name of the fire, the fire's size in acres, and other information. The map is updated daily.

Monday

One simple - too simple? - graph to explain US economy's performance

The graph comes from Thomson Datastream via Derek Thompson of TheAtlantic.com - and it shows that the US economy's performance over the last five years was better than that of comparable developed countries: a shallower recession with a faster recovery. Thompson attributes this performance to the facts that we:
(a) control our own currency and (b) used aggressive monetary policy to save the banks and lower interest rates while running high deficits.
Do you agree? What's your interpretation of the graph?

Thursday

The importance of context

The New York Times ran an article in today's paper describing the differences in methodology used to develop suicide rates for the military, and the methodology used to develop the rate for the civilian population. The Times says that Pentagon medical statisticians use
a total population figure that includes all Guard members or reservists who spent any period of time on active duty in a given year, even if it was only a few days. According to that approach, the total active military population was about 1.67 million for all of 2009, a review of Pentagon data shows.
But at almost any given moment, the United States military is much smaller than that. Another office of the Pentagon, the Defense Manpower Data Center, the personnel record-keeping office, used a total population number of about 1.42 million service members in 2009. That figure was calculated by including only National Guard and reserve troops who had been on active duty for at least six months in a given year.
Therefore, because the denominator is too large, the military has been understating the suicide rate. (You can find a reasonable explanation of how to calculate a rate here.) Why is this important? Because an understated military rate looks comparable to the civilian rate, making the problem appear smaller than it is.
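The arithmetic behind the Times' point is easy to check. The two population figures below come from the article; the suicide count is hypothetical, chosen only to show how the denominator moves the rate:

```python
# Effect of denominator choice on a rate per 100,000.
# Population figures are from the Times article; the suicide
# count is hypothetical, for illustration only.

def rate_per_100k(events, population):
    return events / population * 100_000

suicides = 300  # hypothetical annual count

broad = 1_670_000   # Pentagon figure: anyone with any active duty in 2009
narrow = 1_420_000  # DMDC figure: at least six months of active duty

print(f"rate with broad denominator:  {rate_per_100k(suicides, broad):.1f}")
print(f"rate with narrow denominator: {rate_per_100k(suicides, narrow):.1f}")
# The larger denominator yields a lower rate (about 18.0 vs. 21.1
# per 100,000 here) - the same count looks like a smaller problem.
```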

Tuesday

Global warming in Alaska

The Guardian is running a powerful series on the impact of global warming on life in the indigenous villages in Alaska. Nearly 200 are under threat - that's a threat of washing away:
A study by the US Army Corps of Engineers on the effects of climate change on native Alaskan villages, the one that predicted the school would be underwater by 2017, found no remedies for the loss of land in Newtok.
The land was too fragile and low-lying to support sea walls or other structures that could keep the water out, the report said, adding that if the village did not move, the land would eventually be overrun with water.

The second screenshot shows the extent of Arctic sea ice melt. Climate change is happening fast in Alaska - in addition to villages at risk, animal habitats are changing. The series continues tomorrow.

Update, May 15: See this article in Scientific American about the possible impact of rising sea levels along the East Coast: a five-foot rise in sea level over the next century would mean that a storm of Sandy's impact could occur much more often.

Monday

Atmospheric carbon dioxide continues to increase

Update, May 14: You can read Climate Central's take on why this is an important measure here.

Two weeks ago I wrote a post about the Keeling Curve, which measures the concentration of carbon dioxide in the atmosphere, and explained why that's important. If you're wondering why NOAA reported that the Earth had, on Friday, reached the threshold level of 400 ppm of carbon dioxide in the atmosphere, but the Scripps Institution of Oceanography did not, there's a simple explanation - time zones. As a note on the Scripps site puts it:
May 10 Comment: NOAA has reported 400.03 for yesterday, but Scripps has reported 399.73. The difference is similar to other differences we have reported. The difference partly reflects time zone differences. NOAA uses UTC, whereas we use local time in Hawaii to define the start and stop of a given day. Changing to UTC excludes the lower CO2 period from the baseline on the May 9, shifting it to May 10.
399.73 or 400.03 - both are bad. There's a good roundup of this and other climate news on the blog "Scrapbook of a Climate Hawk," here.


Friday

Meat, vegetables, and calcium chloride: a food diary from around the world


Nutritionists tell you that an easy way to make sure you are eating a balanced diet is to have a lot of colors on your plate. Here, from the photography website Fstoppers, is a look at a week's worth of - often colorful - groceries from around the world. It's fascinating. The US and UK families have a lot of colors - in their packaging. (That's a screenshot of the UK family, above.)

The Australian family adds a great deal of meat:

And the Italian a great deal of bread:

This family lives in Chad:

and this one in Guatemala:


Whose meals would you rather join?


Wednesday

Eric Schmidt on disruptive technologies

McKinsey (free when you register) has posted an interesting interview with Eric Schmidt, Executive Chairman of Google, on disruptive technologies - those "likely to have the greatest impact on economies, business models, and people." (You can also read a transcript if you prefer.)

Schmidt points out that the main issue is the explosion in knowledge technology:
We’re going, in a single lifetime, from a small elite having access to information to essentially everyone in the world having access to all of the world’s information. That has huge implications for privacy, communications, security, the way people behave, the way information is spread, censorship, how governments behave, and so forth.
The McKinsey editors focus on four areas of Schmidt's discussion:

1. Biology is going digital - in the past few years, much of what was analog in biology, like how proteins fold or how DNA works, has become possible to model. Proteins are one example - they have complex structures that are hard to predict. (If you haven't seen it, the website Foldit challenges users to find the best way to fold different structures and predict the most likely structure of a particular protein.) Digital tools in biology should improve health care, and medical care is likely to continue changing rapidly.

2. New materials, new ways of manufacture - Schmidt points out that new materials can now be manufactured at a large scale, and new means of production, like 3-D printers, are rapidly becoming available. He's not making specific predictions, but the general statement he makes is compelling:
So that revolution, plus the arrival of three-dimensional printing, where you can essentially build your own thing, means that—during the rest of our lifetimes, anyway—it’ll be possible to build very interesting things from very interesting, new materials, which have all sorts of new properties.
This might be both good and bad - there have been reports recently of guns made using 3-D printers - but it is worth thinking about.

3. Using computers to support decision-making. We can think about using computers in all sorts of ways beyond gaming and communicating. Schmidt talks about different interfaces (Siri, anyone?) but captures the essence when he says this:
And the ultimate model is that the computer does what it does well, which is these complicated, analytical needle-in-a-haystack problems, and has perfect memory. And humans do what we do well, which is judgment, and having fun, and thinking about things. The relationship is symbiotic. The computer is making suggestions that are pretty good, they’re pretty helpful, but you’re ultimately in charge.
4. Education is important - machines are taking over what low-wage workers once did - Schmidt's example is supermarket checkouts. That leaves plenty of formerly low-wage workers without jobs. They need better education, Schmidt argues. He follows that with a pitch for more immigration of high-skilled workers: "[Y]ou want an unfair share of highly educated people."

It's an interesting interview and a great starting point for thinking about the issues Schmidt raises. What do you think of his points? My examples?

Monday

Statistics in context

Today's two stories might seem like unrelated subjects, but they're not entirely - both are about applying statistical analysis in a context that might not seem like a reasonable candidate. The first, "The Evolution of King James" by Kirk Goldsberry, is about LeBron James and his improvement in scoring:
Over the years, James has attempted thousands of field goals, but those shots are going in at much higher rates recently. In James's rookie year he shot 42 percent from the field and 29 percent from beyond the arc. This year those numbers are 56 percent and 39 percent, respectively. There are two reasons for that substantial improvement in his field goal percentage: (1) He's a much better shooter now, and (2) also a larger share of his shots are close to the basket now.
How did he make the change? And almost more important, how did he know that he wanted to make a change and what change to make? James listened to some commentary. He thought hard about his game. And he changed it, going from a 3-point and wing shooter to a post shooter. Here are his most common shot locations, during James's first and second years in Miami:
The story is about hard work - grueling work - and about using numbers, and context, to guide that work.

The other story, "Solving Equation of a Hit Film Script, With Data," by Brooks Barnes in today's New York Times, has generated a large number of comments and shot almost to the top of the most-emailed list. When I read the article this morning I was initially in the camp of "you can't measure art" - until I got to these paragraphs:
Mr. Bruzzese emphasized that his script analysis is not done by machines. His reports rely on statistics and survey results, but before evaluating a script he meets with the writer or writers to “hear and understand the creative vision, so our analysis can be contextualized,” he said.
But he is also unapologetic about his focus on financial outcomes. “I understand that writing is an art, and I deeply respect that,” he said. “But the earlier you get in with testing and research, the more successful movies you will make.”
The service actually gives writers more control over their work, said Mark Gill, president of Millennium Films and a client. In traditional testing, the kind done when a film is almost complete, the writer is typically no longer involved. With script testing, the writer can still control changes.
One Oscar-winning writer who, at the insistence of a producer, had a script analyzed by Mr. Bruzzese said his initial worries proved unfounded.
“It was a complete shock, the best notes on a draft that I have ever received,” said the writer, who spoke on the condition of anonymity, citing his reputation.
It's partly the comment about context. But it's also partly, I think, the acknowledgment that Bruzzese is doing some interpreting too. What do you think?

Friday

Warming ocean surface temperatures, and a cool chart

Did the water at your East Coast beach seem warmer than usual last summer? That's because it was: sea surface temperatures for the Northeast Shelf Ecosystem, which reaches from Cape Hatteras, North Carolina to the Gulf of Maine, reached a record high of 14 degrees Celsius in 2012, higher than the average of 12.4 degrees Celsius for the past 30 years. That's according to a new report from NOAA's Northeast Fisheries Science Center.

And it's not just the surface temperatures that are increasing - the warm water thermal habitat was at a record high, while cold water habitat was at a record low. Warm water went deeper than usual, and the habitat is changing. What is the impact? According to NOAA,
Temperature is also affecting distributions of fish and shellfish on the Northeast Shelf. The advisory provides data on changes in distribution, or shifts in the center of the population, of seven key fishery species over time. The four southern species - black sea bass, summer flounder, longfin squid and butterfish - all showed a northeastward or upshelf shift. American lobster has shifted upshelf over time but at a slower rate than the southern species. Atlantic cod and haddock have shifted downshelf.
You can see the movement in the chart at the top of the post. Or, as Grist.org puts it, "record-breaking temperatures  . . . are driving the fish away from fast-heating waters to more hospitable depths and latitudes."

The warming won't affect the appearance of mung seaweed on Cape Cod, at least not according to this National Park Service information sheet. That apparently drifts in from points farther north.  

I am quite taken with the way the chart incorporates geographical information to show the movement of species. Do you agree?

Thursday

Oregon Health Study - first results

Here's a little more information about the Oregon Health Study's first published results, reported today in the New York Times. Unfortunately, the full article, in the New England Journal of Medicine, here, is behind a paywall. But here's the best takeaway, from the study's web site:
For uninsured low-income adults, Medicaid significantly increased the probability of being diagnosed with diabetes, though it had no statistically significant effect on measured blood pressure or cholesterol. Medicaid reduced observed rates of depression by 30 percent and increased self-reported mental health. Medicaid virtually eliminated out-of-pocket catastrophic medical expenditures, and increased use of physician services, prescription drugs, and hospitalizations.

This is not nothing, and those commenters who think it is are overstating. But as always, it's important to interpret statistical studies carefully. To its credit, the Times Economix Blog has a post, "What the Oregon Health Study Can't Tell," by reporter Annie Lowrey, that does so:
Where it says something, it says a lot: it provides strong evidence that Medicaid recipients will spend more, use more tests, experience less depression, have fewer bills sent to collection agencies, and so on. It shows health insurance working just the way insurance is supposed to work: protecting the financial stability of the people purchasing it.
The biometric results are compelling, too. The authors chose a handful of conditions that were common, important, easy to test for and treatable to include in the study. Medicaid does not seem to do much to improve health outcomes related to those conditions in two years.
But there are many more questions that the Oregon Health Study simply cannot answer, despite the overheated rhetoric out there today. Does Medicaid improve health over a decade? What might Medicaid do for lifetime health costs? We do not know, even if the study provides some clues. Nor could this study answer the question of whether the Medicaid expansion will be “worth it,” and why. What study could?

You can find a roundup of responses here.

Tuesday

Global Warming Measurements, now on Twitter

 
Here's another way of looking at climate change: measuring the carbon dioxide (CO2) in the air. And that's what the Mauna Loa record, which has been kept since the late 1950s, does. It's also known as the Keeling Curve, after Charles David Keeling, who set up the program and directed it for many years. 

Keeling was one of the first climate scientists to discover that the earth might behave with surprising regularity (given the right scale), and the long-term effort to measure atmospheric CO2 grew out of that insight. 
The value of the Mauna Loa record soon became readily apparent. Within just a year or two, Charles David Keeling had shown that CO2 underwent a regular seasonal cycle, reflecting the seasonal growth and decay of land plants in the northern hemisphere, as well as a regular long-term rise driven by the burning of fossil-fuels.
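Those two components - a steady long-term rise plus a regular seasonal cycle - are easy to see in a toy model. Here's a minimal sketch in Python with invented numbers (not the actual Mauna Loa data): the seasonal wiggle averages out over whole years, so a simple line fit recovers the underlying trend.

```python
import numpy as np

# Toy Keeling-style series: a linear rise plus a seasonal cycle.
# All numbers are illustrative, not actual Mauna Loa measurements.
years = np.arange(1958, 2014, 1 / 12)       # monthly samples
trend = 315 + 1.5 * (years - 1958)          # ppm, roughly linear growth
seasonal = 3 * np.sin(2 * np.pi * years)    # ~6 ppm peak-to-trough cycle
co2 = trend + seasonal

# The seasonal cycle averages out over whole years, so a straight-line
# fit through the monthly values recovers the long-term trend.
slope, intercept = np.polyfit(years, co2, 1)
print(f"estimated rise: {slope:.2f} ppm per year")
```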
There's some fascinating history at the website of the Scripps Institution of Oceanography, Keeling's academic home, here. And now Scripps is posting the daily Keeling Curve on Twitter. You can follow it here. What does it mean? Atmospheric carbon dioxide has been increasing steadily. You can see the measure is approaching 400 ppm - a symbolic threshold on the way to a different climate. You can read more here.

Friday

Hiatus week of April 22

I will be on a break from blogging next week with a last-minute large project and then some travel. In the meantime, if you haven't already seen it, make sure you read Paul Krugman's New York Times column about the "Excel depression." It's his take on the economic paper that concluded that once national debt exceeds 90 percent of gross domestic product, economic growth drops off sharply. The claim gave some theoretical weight to the politicians who argued for economic austerity. Turns out the paper may have been, um, incorrect.

Yes, there was a coding error and, according to Krugman, the authors omitted some data and used "questionable statistical procedures." They've now released their data and original spreadsheet, which is how these errors came to light. You can see sections of the original spreadsheet here, if you're interested. But as is often the case, the issue was as much about how the original study was used and reported as about the study itself. As Krugman puts it:
[The] tipping-point claim was treated not as a disputed hypothesis but as unquestioned fact. For example, a Washington Post editorial earlier this year warned against any relaxation on the deficit front, because we are “dangerously near the 90 percent mark that economists regard as a threat to sustainable economic growth.” Notice the phrasing: “economists,” not “some economists,” let alone “some economists, vigorously disputed by other economists with equally good credentials,” which was the reality.
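The spreadsheet side of the story deserves a small illustration: a formula whose range stops a few rows short silently changes an average. Here's a sketch in Python with invented growth figures (not the actual Reinhart-Rogoff data):

```python
# How a spreadsheet range error shifts an average.
# These growth rates are made up for the illustration.
growth = {
    "Country A": 2.2, "Country B": 1.8, "Country C": 2.6,
    "Country D": 0.9, "Country E": 2.4,
}

full_avg = sum(growth.values()) / len(growth)

# A formula like =AVERAGE(B2:B4) where =AVERAGE(B2:B6) was intended
# silently drops the last two countries:
included = list(growth.values())[:3]
partial_avg = sum(included) / len(included)

print(f"all countries: {full_avg:.2f}%, truncated range: {partial_avg:.2f}%")
```

Nothing in the truncated version looks wrong on screen - which is exactly why releasing the original spreadsheet mattered.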
Read the full column. See you in a week.

Thursday

March weather, 2012 and 2013

 
March 2012 was unusually warm, and March 2013 unusually cold. The short-term explanation appears to lie in the Arctic Oscillation, as described in the NOAA video above. The cold March came at the tail end of a cold winter.

But a possible longer-term explanation seems to be emerging as well. Remember, climate and weather are two different things, and this video describes the weather. But yes, the climate is changing. And one of the effects appears to be on sudden stratospheric warming events like the one that occurred last January.
Sudden stratospheric warming events take place in about half of all Northern Hemisphere winters, and they have been occurring with increasing frequency during the past decade, possibly related to the loss of Arctic sea ice due to global warming. Arctic sea ice declined to its smallest extent on record in September 2012.
And yes, sudden stratospheric warming events can affect the Arctic Oscillation. You can read more about them here. As they say, climate is what you expect; weather is what you get.

Update, May 6: There's an interesting interview with meteorologist Paul Huttner here. He talks about the unusual weather events we've been seeing, like last week's late snowstorm. And how that's a weather event, but there are lots of signs of regional climate change around the world.

Tuesday

Social media and disasters


There's a fascinating story in The Guardian's data blog about how social media allowed information to be shared in the aftermath of the explosions at the finish line of the Boston Marathon yesterday. Social media played an important role as local sites gathered accommodation offers as well as information. Twitter was very important - and I learned about a site, Trendsmap, that provides real-time mapping of Twitter feeds (in all languages, not just English). And it's interactive - click on one of those links and the Twitter feed appears in a window:
It appears that the investigators have found a circuit board used to trigger the bombs.

ESRI, which I've written about before, does something similar, though not in real time.

David Brooks is not thinking straight about big data

David Brooks is doing the public no favors in his column today in which he suggests, among other things, that analysis of big data is devoid of human interpretation, bias, or judgment. (I am looking forward to reading "Big Data" by Viktor Mayer-Schönberger and Kenneth Cukier.) Leaving aside the headline, which Brooks may not have written, the commenters take the argument apart pretty well. I would just add one more thing: Mr. Brooks gave Jim Manzi's "Uncontrolled" a pretty big push last year. Has he forgotten what he said then? Big data is big data.

You can read my review of "Uncontrolled" here. And if you haven't read it yet, you should.

Monday

Critical reading, of charts

Here's a very good look from the Harvard Business Review (free after registration) at how different presentations of data can reflect different interpretations. It's another reminder of how important critical thinking and skepticism are when evaluating data you are given. As author Jake Porway states:
The most troubling part of all this is that "we the people" rarely have the skills to see how data is being twisted into each of these visualizations. We tend to treat data as "truth," as if it is immutable and only has one perspective to present. If someone uses data in a visualization, we are inclined to believe it. This myopia is not unlike imagining the red velvet cake we see in front of us to be the only thing that could have been created from the eggs and milk we mixed together to make it. We don't see in the finished product the many transformations and manipulations of the data that were involved, along with their inherent social, political, and technological biases.

Friday

"Data can be a source of creativity and art"


Shakespeare Machine Doc from Ben Rubin on Vimeo.

Last Friday I heard Mark Hansen of the Columbia Journalism School speak at a panel presentation on uses of data - that's a quote from his presentation in the post title. (You should click on his name just to see who funded the institute he heads.)

The Shakespeare Machine has 37 long screens - blades - one for each of Shakespeare's plays. Hansen, a statistician, worked with artist Ben Rubin to develop algorithms that identify different word combinations. There might be a display of "you ___" phrases - "you gold, you king, you fool" - appearing on blade after blade. As ARTnews describes it:
Each blade contains a whole play. Once a cycle, for about two minutes, the blade streams its play in its entirety. Then selections from its text will appear–terms selected for grammatical, contextual, rhythmic, or semantic attributes, like a verb followed by the word it, a noun phrase containing a part of the human body, and adjective-conjunction-adjective.
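To get a feel for this kind of pattern matching, here's a minimal sketch - my own toy example, not Hansen and Rubin's actual algorithm - that pulls "you ___" phrases out of a snippet of invented mock-Shakespearean text:

```python
import re
from collections import Counter

# Invented text for illustration; the installation works on whole plays.
text = """
You fool! You king of codpieces! I tell you, you gold,
you jewel, you are too quick. You fool, I say.
"""

# Find every "you <word>" pair, case-insensitively. A naive pattern like
# this also catches "you are" - real term selection would use grammatical
# tagging, as the ARTnews description suggests.
phrases = re.findall(r"\byou (\w+)", text, flags=re.IGNORECASE)
print(Counter(p.lower() for p in phrases).most_common(3))
```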
The Shakespeare Machine has been installed in the lobby of the Public Theater in New York. So brush up your Shakespeare - and see if you can figure out which blade matches which play.

Thursday


That's a map of state-by-state obesity trends in the US, measured by BMI (body mass index). That's a pretty rapid shift, in the 25 years from 1986 to 2010. Further data from the CDC here. I've written about obesity before, here for example, but it's always good to see the data in a larger context.
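For reference, BMI itself is simple arithmetic - weight in kilograms divided by height in meters squared - with standard cut points layered on top. A quick sketch:

```python
# BMI = weight (kg) / height (m) squared, with the standard categories
# used in obesity surveillance.
def bmi(weight_kg: float, height_m: float) -> float:
    return weight_kg / height_m ** 2

def category(b: float) -> str:
    if b < 18.5:
        return "underweight"
    if b < 25:
        return "normal"
    if b < 30:
        return "overweight"
    return "obese"  # the category the map tracks

print(category(bmi(95, 1.75)))  # 95 kg at 1.75 m -> BMI ~31: "obese"
```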

Art Networks


The Museum of Modern Art, in New York, has an extremely lucid and accessible show "Inventing Abstraction" on view. It's full of fascinating and often tessellated or otherwise mathematical works of art. You can see some of them at the exhibition's website.

The network in the screenshot at the top of the post is reproduced at the entrance to the exhibition. It illustrates the extensive connections among the early abstract artists; those with the most documented connections are shown in red. It's fascinating to look at - you'll identify patterns of geography as well as influence if you look closely. The online version is interactive. Click on Vaslav Nijinsky's name, for example, and you'll find connections to Claude Debussy, which is not surprising, and to Duncan Grant, a connection that was new to me. As was Nijinsky's art on paper, three pieces of which are included in the exhibition.



Tuesday

Social Impact Bonds in New York and elsewhere


I last wrote about social impact bonds back in August, and about their sibling, health impact bonds, about a month later. The field is moving, and it's time for an update. According to news reports like this one, there are 14 social impact bonds (SIBs) issued or in development in the UK. The UK organization Allia, "The Social Profit Society," has issued what it calls a "Future for Children" bond. And New York's own SIB has moved from pilot stage to full implementation.

New York City's effort focuses on youth held at the City's jail on Rikers Island, where a bed costs $85,000 a year. Studies show that youth who have been in custody are likely to return to jail over the succeeding six years. It's obviously better for the youth not to return to jail, and there are large potential savings in reducing their future days in jail. In New York, the SIB-funded program provides a specialized cognitive behavioral therapy program called Moral Reconation Therapy (MRT) to 16-18 year olds who are held at the jail for four days or more.

You'll notice, from the screenshot illustrating the transactions among the players in New York City's SIB, that there are some differences from the simpler model I described in my earlier post. Evaluation is expensive, and the independent evaluator is funded outside the SIB. So is the work of the intermediary, MDRC. The other major difference is the funding mechanism itself: Goldman Sachs has loaned MDRC, the intermediary, $9.6 million. That loan is backed by a $7.2 million grant from the Bloomberg Family Foundation. If the program succeeds - its break-even point is a 10% reduction in future jail days - the investors profit, and there are large possible savings for taxpayers. If the program does not succeed, Goldman has not put all its money at risk.
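The risk-sharing arithmetic, using the figures above, is simple - this is only a sketch, and the actual repayment terms are surely more detailed:

```python
# Back-of-the-envelope risk sharing in the New York SIB, using the
# publicly reported figures. Actual deal terms are more complex.
loan = 9.6e6        # Goldman Sachs loan to MDRC
guarantee = 7.2e6   # Bloomberg Family Foundation grant backing the loan

# If the program misses its target and the loan is not repaid,
# the guarantee absorbs most of the loss:
max_investor_loss = loan - guarantee
print(f"Goldman's maximum exposure: ${max_investor_loss:,.0f}")
```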

Why this structure? Not all social programs are equally successful. The least risky are evidence-based programs, those that have been shown to be successful, at least for a well-defined population over a set amount of time. Other programs, like MRT, are quasi-evidence-based: the research results are mixed. (One reason MRT was chosen as the intervention is that it fits well into the jail's operations.) New, untested programs are the riskiest. Even tested programs can be risky: a SIB-funded program might require a scale of services that has never been attempted. The challenges of translating a program that works in one setting (the community) to another setting, like a jail, also increase the uncertainties.

A couple of other things to note. A lot of learning happened during the pilot period. Although there's no opting out of the jail-based program, a lot of kids weren't participating, and "we had to figure out why," says MDRC's David Butler. In jail, there may be a lot of reasons that have nothing to do with the program, such as administrative and punitive segregation, or programs cancelled due to various jail issues. As in many other social service programs, data collection is a challenge. MDRC staff member Timothy Rudd pointed out other uncertainties in the program as well. Any savings are not spread evenly over all program years but become more evident, if they exist, in later years. And while the evaluation will examine the first cohort of participants, those results will be extrapolated to five subsequent cohorts.

So this is a program to watch. I'll keep updating every six months or so, as we see what happens.
