One of the major attractions of Scheveningen (if you can pronounce that, you have successfully adapted to Dutch culture) is a 360-degree painting by the Dutch painter Hendrik Willem Mesdag. It depicts the North Sea coast near Scheveningen in the nineteenth century, long before its neighbouring city, The Hague, absorbed this coastal fishing village into one big agglomeration. Mesdag created an illusion that works surprisingly well: the painting appears to have depth, and you feel as if you are standing on a dune, looking out over the beach, or down on the village with its neat little houses and the villas where rich city folk spent their free time. What is also striking is the dominance of fishing, together with transport, in the coastal zone. You see some sunbathers, but they are easily outnumbered by fishers and other workers in the fishery, such as the horsemen towing the bomschuiten (flat-bottomed fishing vessels, a bit like the pink).
How different it is nowadays. International trade has mushroomed. We have largely replaced sails and steam engines with combustion engines running on oil and gas, scattering drilling platforms all over the North Sea to get to the stuff. Wind is making a comeback as wind turbines form entire forests in the open sea. Meanwhile, fishing has become something to limit rather than promote: in Mesdag's day the British scientist Thomas Henry Huxley called fishery resources "inexhaustible", but for numerous stocks we have actually found those limits and are now concerned about crossing them. And we're not only concerned for edible species, but also for marine life in general: enter marine protected areas.
So many uses, so many users, so little resource
Like the North Sea, many marine and coastal ecosystems have many different uses, many different users, and many different ways to meet the users' needs. Mangrove forests provide coastal protection, a nursery ground for wild fish, a source of juvenile shrimp for extensive shrimp farming systems, and a fascinating ecosystem to float through for tourists. Likewise, other coastal ecosystems like mudflats and coral reefs provide a variety of goods and services to a variety of users. And none of these biomes are limitless.
Given this variety of uses it is not surprising that policy makers need to make many tradeoffs. How far are we willing to limit fishing for an extra gigawatt of wind energy? How do we trade off port capacity against tourism? Does the income generated by an extra hectare of intensive shrimp aquaculture offset the loss in biodiversity and coastal protection?
All these examples are tradeoffs between uses, but policy makers also have to make difficult choices within one and the same use. What is worse: a small flood every year, or a big flood every ten years? How do we rebuild fish stocks if local communities depend so much on fishing that they cannot miss a single year of it?
Note that simply putting a price tag on services may not be enough: the average per-hectare value of a mangrove forest may be low when the forest is large, but once we have cut down most of it, the last few remaining hectares will be much more valuable. Moreover, aggregating monetary values over all stakeholders and over time may give you a single figure (the net present value), but this simplicity obscures problems of poverty and income distribution. So we may need to consider the entire tradeoff.
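To make this concrete, here is a minimal sketch in Python with entirely hypothetical numbers: a made-up concave benefit function for mangrove services, a rough marginal value of one hectare at different forest sizes, and a single net present value that says nothing about who receives the benefits or when. None of this comes from the studies mentioned here.

```python
# A sketch with entirely hypothetical numbers: a concave benefit function for
# mangrove services, the approximate value of one extra hectare at different
# forest sizes, and a single NPV figure that is silent on distribution.

def annual_services(hectares, scale=1000.0):
    """Hypothetical concave benefit function for a mangrove forest's services."""
    return scale * hectares ** 0.5

def marginal_value(hectares, delta=1.0):
    """Approximate value of one extra hectare at a given forest size."""
    return annual_services(hectares) - annual_services(hectares - delta)

def npv(annual_value, discount_rate=0.05, years=30):
    """Net present value of a constant annual benefit stream."""
    return sum(annual_value / (1 + discount_rate) ** t for t in range(1, years + 1))

print(round(marginal_value(10_000), 1))   # ~5.0: one hectare barely matters in a large forest
print(round(marginal_value(100), 1))      # ~50.1: the same hectare, once most of the forest is gone
print(round(npv(annual_services(10_000))))  # one aggregate figure, silent on who gets what, and when
```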
Tradeoff analyses and bioeconomic modelling
In my PhD thesis I did a tradeoff analysis of dairy farming and biodiversity conservation, and I recently submitted a paper with a former MSc student of ours, Matteo Zavalloni, and fisheries ecologist Paul van Zwieten in which we analyze the tradeoff between shrimp aquaculture and mangrove conservation in a coastal area in Viet Nam. Both analyses are spatially explicit, i.e. we analyze not only how much of something can or should be done, but also where. The "where" question is quite important, as many uses of marine areas (shipping, fishing, aquaculture) have a spatial dimension.
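For a flavour of what "spatially explicit" means here, the sketch below is a toy example with made-up numbers, not the model from the paper: each coastal cell gets a hypothetical shrimp-farming profit and a hypothetical conservation value, and converting cells one by one traces a simple curve of aggregate profit against remaining mangrove value.

```python
# A toy spatial tradeoff with made-up numbers: five coastal cells, each with a
# hypothetical profit if converted to shrimp ponds and a hypothetical
# conservation value if left as mangrove.

cells = [  # (cell id, shrimp profit if converted, mangrove conservation value)
    ("A", 40, 10), ("B", 30, 30), ("C", 25, 5), ("D", 20, 40), ("E", 10, 15),
]

# Rank cells by profit per unit of conservation value lost (a crude heuristic,
# not the optimization used in the actual studies).
ranked = sorted(cells, key=lambda c: c[1] / c[2], reverse=True)

profit = 0
mangrove_value = sum(c[2] for c in cells)
tradeoff_curve = [(profit, mangrove_value)]
for cell_id, extra_profit, lost_value in ranked:
    profit += extra_profit
    mangrove_value -= lost_value
    tradeoff_curve.append((profit, mangrove_value))

for point in tradeoff_curve:
    print(point)  # (total shrimp profit, remaining mangrove conservation value)
```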
So this will be one of my major focus points: developing tools for quantitative tradeoff analyses of coastal and marine ecosystems. I'm very much a bioeconomic modeller. I guess it's the geek in me: I've always been terrible at practical technical stuff (the holes in my house's walls and the crappy paint jobs on my window panes bear witness to that), but I enjoy the patient development of a complicated quantitative model, or an insightful analytical model. I also enjoy the interdisciplinary nature of this work: to do it right, you need to collaborate intensively with other scientists, mainly ecologists.
Saturday 29 December 2012
Friday 7 December 2012
The Stapel affair: it is worse than we thought
After Diederik Stapel was caught cooking the scientific books, three committees investigated the extent of the fraud at their respective universities (Amsterdam, Groningen, Tilburg), and how it was possible that Stapel committed his fraud on such a massive scale. The report came out last week, and I find its content nothing short of shocking. And I'm not just referring to what they found Stapel did, or to how the universities where he did it never suspected anything. What shocked me most was the conduct of the other researchers. Worse still, many admitted to the practices described below without the slightest notion that they were doing anything wrong.
Repeat the experiments until you get the results you want
Suppose your hypothesis says that X leads to Y. You divide your test subjects into two groups: a group that gets the X treatment and a control group that gets no treatment. If your hypothesis is correct, the treatment group should show Y more often than the control group. But how can you be sure the difference is not a coincidence? The problem is that you can never be certain of that, so the difference should be so large that a coincidence is very unlikely. Statisticians express this through the 'P-value': the probability of finding a difference at least as large as the one you observed if your hypothesis were not true. In general, scientists are satisfied if this P-value is lower than 5%. Note that this means that even if the hypothesis is not true, you still have a 1 in 20 chance of getting results that suggest it is!
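A quick simulation makes that last point tangible. This is my own illustrative sketch, not anything from the report; it assumes numpy and scipy are available and uses a plain two-sample t-test: when there is no true effect at all, a test at the 5% level still flags a 'significant' difference in roughly 1 out of 20 experiments.

```python
# Simulate 10,000 experiments in which the treatment has no effect at all:
# both groups are drawn from the same distribution, yet a standard two-sample
# t-test at the 5% level still declares a 'significant' difference ~5% of the time.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_experiments = 10_000
false_positives = 0
for _ in range(n_experiments):
    treatment = rng.normal(0, 1, size=30)  # no real treatment effect
    control = rng.normal(0, 1, size=30)
    _, p_value = stats.ttest_ind(treatment, control)
    if p_value < 0.05:
        false_positives += 1

print(false_positives / n_experiments)  # roughly 0.05, i.e. about 1 in 20
```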
So here is the problem. Some of the interviewees in the Stapel investigation argued it is perfectly normal to do several experiments until you find an effect large enough for a P-value lower than 5%. Once you have found such a result, you report the experiment that gave you this result and ignore the other experiments. The problem here is that any difference you find can be due to coincidence. If you do two experiments, you have a chance of about 1 in 10 that at least one of them gives a P-value lower than 5% if the hypothesis is not true; if you do three experiments, the chance is about 1 in 7. This strategy must have given a lot of false positives.
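The arithmetic behind those 1-in-10 and 1-in-7 figures is simple: if each experiment independently has a 5% chance of a spurious 'significant' result, the chance that at least one of k experiments comes out significant is 1 − 0.95^k. A few lines of Python spell it out:

```python
# Chance of at least one spurious 'significant' result when the hypothesis is
# not true and you run k independent experiments at the 5% level: 1 - 0.95**k.
for k in (1, 2, 3, 5, 10):
    print(k, round(1 - 0.95 ** k, 3))
# 1 -> 0.05 (1 in 20), 2 -> 0.098 (about 1 in 10), 3 -> 0.143 (about 1 in 7),
# 5 -> 0.226, 10 -> 0.401
```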
Select the control group you want
No significant difference between the treatment group and the control group in this experiment? No sweat, you still have data on the control group in an experiment you did last year. After all, they are all random groups, aren't they? So you simply select the control group that gives the difference you were looking for. Another recipe for false positives.
Keep mum about what you did not find
Another variant: you had three hypotheses you wanted to test, but only two were confirmed (ok, technically hypotheses are never confirmed - you merely reject their negation). So what do you do? You simply pretend that you wanted to test only those two all along and ignore the third one.
Select your outliers strategically
Suppose one of your test subjects scores extremely low or high on a variable: this person could be an exception who cannot be compared to the rest of your sample. For instance, somebody scores very high on a performance test, and when you check who it is, it turns out that this person has done the test before. That is a good reason to remove the observation from your dataset, because you are comparing this person to people who are doing the test for the first time. However, two things are important here: (1) you should report that you excluded the observation, and why; and (2) you should make that decision regardless of its effect on the significance of your results. It turned out that many interviewees (1) did not report such exclusions in their publications, and (2) would only exclude an observation if doing so made their results 'confirm' their hypothesis.
And all this seemed perfectly normal to some
But as I said earlier, the most troubling observation is that the interviewees had no idea they were doing anything wrong. They said these practices are perfectly normal in their field - in fact, on one occasion even the anonymous reviewer of an article requested that some results be removed because they did not confirm the hypothesis!
The overall picture emerges of a culture where research is done not to test hypotheses, but to confirm them. Roos Vonk, a Dutch professor who, just before the whole fraud came out, had announced 'results' from an experiment with Stapel 'showing' that people who eat meat are more likely to show antisocial behaviour, argued on Dutch television that an experiment has "failed" if it does not confirm your hypothesis. It all reeks of a culture where the open-minded view of the curious researcher is traded for narrow-minded tunnel vision.
Don't get me wrong here: the committees emphasize (as any scientist should) that their sample was too small and too selective to draw conclusions about the field of social psychology as a whole. Nevertheless, the fact that they observed these practices among several interviewees is troubling.
But the journals are also to blame, and there we come to a problem which I am sure is present in many fields, including economics. Have a sexy hypothesis? If your research confirms it the reviewers and the editor will crawl purring at your feet. If your research does not confirm it they will call your hypothesis far-fetched, the experimental set-up flawed, and the results boring. It's the confirmed result that gets all the attention - and that makes for a huge bias in the overall scientific literature.