Tuesday 8 January 2019

Oh No-ah you didn't: Demographic behaviour and gender egalitarianism


Recently Noah Smith posted a tweet seemingly questioning the relationship between demographic processes and feminism, drawing on cross-country comparisons from Pew. I understand that this is a tweet, and so there was not much room to expound a theory in detail, but much of the claim seemed to rest on the idea that because the United States had higher levels of childbearing and earlier marriage than some countries which are stereotypically stricter in terms of gender norms, this somehow refuted the standard claim that increasing women's agency is associated with lower fertility and later marriage.



There are a number of issues with this supposed refutation, and I'll discuss these in a little detail in the following three sections.

Are cross national comparisons valid?

The claim that Noah seems to advance here is that because the correlation between demographic outcomes and gender equity at one point in time is not strongly negative, this refutes the idea that there is an effect here. This seems to fundamentally misunderstand the claim that demographers make, which is that increases in gender equity over time will tend to decrease fertility rates within the same setting. Considering the stylised diagram, Noah is effectively looking along the green line, and using the variability between countries (red) in TFR to claim there is no correlation between demographic outcomes and egalitarianism.




Now it should be noted that the evidence when looking on a between-country basis is somewhat mixed. However, using cross-sectional comparisons to draw conclusions about trends over time, without accounting for the starting point (the variation in the country TFR in the example at the green line is solely due to variation at the start, since the slopes are parallel), gives a deeply misleading picture. In reality, of course, we are unlikely to see such a neat picture as in the toy example: there is no requirement under the Second Demographic Transition model for societies to transition to low fertility at the same rate as cultural and gender norms change. As such, neither the existence of between-country demographic variation, nor the fact that there is no homogeneous pattern, necessarily refutes the point that gender equality would be associated, perhaps causally, with decreases in fertility rates.
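The toy example can also be put into numbers. This is a minimal sketch in which every value is hypothetical and chosen only for illustration (the three countries, the equity index, the effect size and the baseline TFRs are all assumptions, not data): within each country, rising equity lowers TFR at the same rate, yet a single-year snapshot across countries shows the opposite sign because the starting points differ.

```python
def ols_slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# All numbers below are hypothetical, chosen only to illustrate the argument.
EFFECT = 1.5        # fall in TFR per unit rise in the equity index (same everywhere)
EQUITY_GAIN = 0.01  # yearly rise in the equity index (same everywhere)
START_EQUITY = {"A": 0.2, "B": 0.4, "C": 0.6}  # equity index at year 0


def equity(start, year):
    return start + EQUITY_GAIN * year


def tfr(start, year):
    # Countries that start out more equal also start with a higher baseline TFR,
    # so the starting points differ while the slopes stay parallel.
    baseline = 2.0 + 2 * EFFECT * start
    return baseline - EFFECT * equity(start, year)


years = [0, 10, 20, 30]
within_slopes = {
    name: ols_slope([equity(s, y) for y in years], [tfr(s, y) for y in years])
    for name, s in START_EQUITY.items()
}

snapshot = 20  # the single year observed in a cross-section
cross_slope = ols_slope(
    [equity(s, snapshot) for s in START_EQUITY.values()],
    [tfr(s, snapshot) for s in START_EQUITY.values()],
)

print(within_slopes)  # negative within every country
print(cross_slope)    # positive across countries at the snapshot
```

The sign flip comes entirely from the assumed starting points, which is why a cross-section on its own says little about the within-country trend.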

What is the model of change?

Another problem is that Noah is incredibly unspecific with his terminology, so we don't have a real logic of change model to deal with. One of the major transformations we have seen that would be widely accepted as a measure of gender equality would be the increase in female educational enrolment, as we can see in the UK and France in the figure below.




This establishes that there is an increase in the indicator we are claiming as a measure of gender equity. How does this fit into our logic of change model for demographic outcomes? The figure below presents age-specific fertility rates measured from the point at which the woman leaves full-time education.

Overall, we notice very little change between cohorts once we take into account the termination of educational enrolment. This leads us to the conclusion that there is a strong association between the two, to the extent that as much as 60% of fertility postponement is explained by our indicator of gender equality, namely female education. Therefore, once we actually operationalise gender equity beyond stereotypes, some reasonably strong relationships emerge.

What is the hypothesised relationship?

The final problem with the claim is the assumption that gender equity will necessarily be associated with decreases in behaviours like fertility. This is not unreasonable at relatively early stages of the Second Demographic Transition: the difficulty of combining work and educational enrolment with motherhood (as we have seen already) would be expected to decrease the intensity of all demographic behaviours. However, the assumption that this relationship is monotonic has little basis in demographic theory. Indeed, once we get to a certain stage, gender equality is expected to increase fertility levels: this has been proffered as an explanation for European fertility variation, with more gender-equal countries in Scandinavia benefiting from higher fertility rates, and more traditional countries in the South experiencing a drag on fertility as conservative social norms clash with educational and labour market requirements for women. The claim that feminism will reduce fertility universally assumes that a country is at an early stage of gender development and of the Second Demographic Transition: I am not sure that this holds for the dataset that Noah is using.

Monday 7 January 2019

The Accuracy of Rapid Diagnostic Tests for Malaria [Reblogged]

I had the privilege of giving an interview to Thomas Locke from Fight Malaria. I'm reblogging the transcript, with the audio available via their iTunes and Spotify channels. Full technical details are available in the working paper of initial research from the Portsmouth Brawijaya Working Paper Series (Number 8): Field interviewers effects on the quality of malaria diagnosis in Malawi published in conjunction with Dr. Ngania Kandala and Prof. Saseendran Pallikadavath.

The original is hosted on the Fight Malaria blog

Hello, I’m Thomas Locke and this is Five Minutes, the podcast that brings you closer to the people fighting malaria.
Today I’m joined by Dr Mark Amos to discuss the accuracy of malaria testing. So, how accurate are Rapid Diagnostic Tests, or RDTs, tools that are becoming increasingly popular? And how do they compare to traditional lab testing?
This is Five Minutes with Dr Mark Amos.
Mark thanks for joining me.
Thank you very much.
Talk me through how you’re assessing the accuracy of malaria testing.
This is essentially a validity study. We’ve been comparing two methods of diagnosis. We’ve been comparing field diagnostic methods with lab-based methods. And we’re assuming that lab-based methods here are the gold standard against which we’re judging the accuracy both in terms of false negatives and false positives for testing administered in the field.
Presumably, field tests are more accessible than lab testing?
They are, there are a number of advantages to field testing. It’s very rapid so you can give a diagnosis to the potential patient straight away. You can also carry it with you so you don’t need people to turn up to a lab or worry about transport. The concern is that it is being tested in an uncontrolled environment, which is why we wanted to look at the accuracy of the testing mechanisms available.
How did you conduct this research and what were the outcomes?
Our research was using Demographic and Health Survey data which is available from the DHS website. We found that there was a reasonable degree of accuracy in most cases, but there was some variation in the degree of accuracy in testing depending on the interview team. Now, for false positives, around about 15% of variation in the level of false positives was attributable to the interview team. However, we did actually find for false negatives that around 48% of variation in the rate of false negatives was attributable to the interview team. Now, obviously false positives are reasonably serious, Malawi is a resource-poor setting. We prefer not to have false positives if at all possible, it’s a bit of a waste of resources. However, false negatives are probably, in a sense, the more serious false diagnosis: you’re potentially saying to someone who does have malaria that they’re okay. That has a number of implications, it may delay them seeking care or treatment for malaria or it may actually put them off seeking care or treatment for malaria at all.
What sort of data did you collect?
So the data we’re using is actually secondary data. It was collected by the DHS program in Malawi and we access their data remotely. We have two measures within the dataset: the diagnosis from the field test and the diagnosis from a lab-based result, and we simply compared the two.
The DHS, is that state-owned by the Malawian government, is it USAID, who owns the data?
It’s funded by USAID, it’s a survey that has been running for a number of decades, it started off as the World Fertility Survey. It’s a series of cross-sectional surveys in all areas where there are reasonably high levels of fertility. It captures a number of dimensions of data that might be useful, ranging from contraceptive use and maternal health care utilization to things like AIDS prevalence or the prevalence of malaria.
You found out that there is some degree of accuracy with both methods of testing. What are your next steps?
There are two major steps. Firstly, although there is some variability in false diagnosis, in a way the fact that we’re attributing this to interviewers is actually a reasonably positive step forward, it is actually something we can do something about. It’s not a function of the underlying test, it’s a function of the interview team administering the test. So there might be potential ways of looking at better training or encouraging interviewers to deliver tests in a different way, which might actually increase the accuracy, so there’s a kind of positive policy story there. The other major thing that we want to do going forward is we want to expand our analysis to look at, hopefully, all of the Demographic and Health Survey datasets across sub-Saharan Africa. So we’ll be able to compare between countries and see whether there is something unusual about Malawi or whether this is a generalizable finding.
Dr Mark Amos, thank you.
Thank you very much.


Monday 8 October 2018

A peer review of the Sokal hoax 2


There was a piece published recently that claimed to be an academic grievance study, 'exposing' corruption in sociological fields of publication, which gathered some attention in The Economist and The Atlantic, for example. The full essay can be found here. I'm not going to discuss the intent of the piece here: these sorts of things always involve a certain abuse of faith from peer reviewers, and the claim from one of the masturbation/rape reviewers is relevant here, in that an inexperienced reviewer was attempting to do the right thing in providing constructive criticism, although I do accept that structural factors could pressure genuine reviewers in a negative direction, as the authors of the hoax would claim.

The major problem I wish to discuss is the appalling manner in which this hoax was conducted.

1. Misreporting

Lindsay et al. make repeated reference to the fact that they had 7/20 papers accepted for publication. However, when we examine the actual project notes that are cited in their original article, it emerges that they in fact made 48 new submissions, and have mangled their denominator by excluding submissions where the paper was rejected before being resubmitted. There is therefore a severe upward bias in their measure of success: the success rate plunges from 35% (7/20) to under 15% (7/48).
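The arithmetic is simple enough to check directly. A minimal sketch, using only the counts quoted above (7 acceptances, 20 papers, 48 submissions); the variable names are mine:

```python
accepted = 7
papers_reported = 20    # denominator used by Lindsay et al.
submissions_made = 48   # denominator once every new submission is counted

reported_rate = accepted / papers_reported
corrected_rate = accepted / submissions_made

print(f"reported:  {reported_rate:.1%}")
print(f"corrected: {corrected_rate:.1%}")
```

Which denominator is appropriate depends on whether the unit of analysis is the paper or the submission; the point is that the headline figure only holds for the former.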

2. Lack of a control group

The lack of a control group within this study limits our ability to actually draw much inference about the success rate of the submissions. The authors claim that they are using the overall submission success rate as a 'meta-control' but this is clearly inadequate: there is an obvious and substantive difference between papers that are designed to maximise the probability of success, without the baggage of being based on real data, and the median work of real scholarship. It's also a leap to assume that Lindsay et al. were matched in terms of ability and writing quality to the median submitter to these journals; if Lindsay et al. had submitted some real work we could have differenced out this confounder. But they didn't, so we can't, and a potential bias remains.

3. Methodological inconsistency

Lindsay et al. changed their method throughout the course of the study. This is certainly a major red flag in a lot of experimental social science, for the reason that Lindsay et al. state in their article: they changed their method because their initial attempts were not getting the results that they desired. If we look at their initial claim that it would be relatively easy to get a paper published if it simply sounded right, this seems to have been totally falsified. None of their initial papers went for full peer review. It was only after they changed their approach that they met with any success whatsoever, reporting:

This shift in success rate followed a commitment to understand the field in greater depth that initiated in late November 2017 and progressed through April 2018, by which time we felt we had become sufficiently competent.

4. Subjective interpretation

Lindsay et al. claim that much of their data should have raised alarm bells because it was highly implausible. They do not, however, present statistics such that this can be verified, and we have only their word that these figures should be obviously incorrect. Indeed, the specific paper which they highlight as having implausible statistics ("Dog Park") was the paper for which other academics began requesting field work, so concerned were they about the paper. The paper has been retracted and as such I cannot see any figures claimed, but if this were a true scholarly work, there would need to be some more empirical backing of the claim that such figures were implausible, rather than the simple assurance of a party with a vested interest.

In light of all of these limitations, I'm not sure that we should take the results of this particular grievance study too seriously. This is not to claim that there are no issues in scientific publishing, or in social science publishing: of course there are areas that need to be improved. However, this attempt was not rigorous, and it distracts attention from actual scholarly analysis with deeply concerning findings. It could also falsely lead us to the conclusion that problems are exaggerated or limited to very niche fields, whereas I suspect they are probably more common, although less headline grabbing.

Addendum: As of 12/01/2019, one of the authors has failed an internal review for this study, for failing to seek ethical approval for the deception of peer reviewers. 

Wednesday 23 May 2018

Enforced monogamy and the World We Have Lost

Jordan B Peterson garnered some attention recently by making some controversial claims about the role of partnership norms in society. Specifically, the sentence

The cure for that is enforced monogamy. That’s actually why monogamy emerges

drew some ire when referring to potential motivations for the recent Toronto killings. This drew a lot of attention and some Handmaid's Tale-esque comments, which I think was perhaps a little uncharitable, and indeed Peterson later clarified on his website that he meant social expectations of union forms (although the major source he cites is someone else's post hoc opinion on reddit, which is not the height of academic referencing). Nonetheless, even if we take this explanation at face value, and possibly concede that relationships during the Golden Age of Marriage were Chesterton's fence with regard to social stability, the causal effect of marriage is probably more complicated than that. While family formation norms historically were perhaps more homogeneous, there is a deal more nuance to the way that families formed than Peterson is actually letting on here.

Relationship formation

The first stage of the enforced monogamy type model would be a relatively early union formation, enforced by some sort of contract such as marriage. This basic claim is somewhat valid: examining the figure below, we can see a shift away from early and direct marriage (where marriage would have acted as a gatekeeper to sexual activity, although there are suggestions that this is not as important as Peterson supporters would claim, at least from a female perspective). We can see at least, in a subset of countries drawn from the Harmonized Histories dataset, that there was a clear retreat from direct marriage (left column, upper panel), which predominated for earlier cohorts born 1945-54 (left column, lower panel), in favour of a delayed marriage preceded by cohabitation (right column, upper panel), which emerged for later cohorts born 1965-74 (right column, lower panel). There is also some evidence that even where this premarital cohabitation leads to marriage, the number of serial monogamous relationships prior to a final marital union is tending to increase.




Relationship dissolution

Where Dr. Peterson makes the most fundamental error is in believing that these primarily marital relationships marked anything near universal and enforced monogamy that prevented union dissolution. Indeed, as the figure above indicates, there is a small but non-negligible proportion of relationships that ended in divorce (red bar) even where this activity was heavily socially stigmatised. The other point worth making- I've made it before- is that there was a cultural movement away from lifelong monogamy: the pent-up demand for legal recourse to end relationships is clear, expressed by the spike in divorces in 1967.


Source: Office for National Statistics: Divorces in England and Wales: 2014

The other point that we should bear in mind is that the relationship between union dissolution norms and the context within which they are occurring is somewhat reciprocal. This is best explained by Liefbroer and Dourleijn, who examine the changing effects of selection on the rate of marital dissolution, and the effect of cohabitation prior to marriage. The extent to which living with a partner tends to weaken the subsequent union has a history of examination in demography, and there is a fair degree of selection going on: that is, the lack of a legally enforced marital union may not be responsible for increased union dissolution so much as the characteristics of the people entering those unions. The major point that Liefbroer and Dourleijn make is that this selection is likely to vary over time and context: where direct marriage is near universal, merely living with a partner is selected only for the most unstable types of union and hence is more likely to break up. As cohabitation becomes more common, however, this selection reverses, so that the institutionalisation of premarital cohabitation (or cohabitation in general) means that it is only in very extreme cases that people will tend to marry directly. While there is probably some causal effect (this is reflected in the figure below), a large proportion of many outcomes for marriage tend to be explained by selection, for instance within the context of both mental health and health in mid life. The effect of marriage therefore is highly dependent on the social context in which it occurs, and its direct effect is limited even then.

Source: Liefbroer and Dourleijn (2006), p. 217

It's difficult therefore to see a realistic means by which such socially enforced monogamy could be useful as a policy intervention: universal marriage- such as it existed- was only really able to act as an enforcer because it existed within a certain social context. The genie is out of the bottle in that respect: social norms have moved on and it's clear that legal institutions have caught up with cultural mores, rather than the other way around. Indeed, there is evidence that in making divorce harder, you may actually depress the type of marital union you are seeking to encourage. It's fine to claim that the social and marital patterns seen in the Golden Age of Marriage are a result of complex processes: it's contradictory to then argue that transplanting those marital patterns to the current social context would have easily predictable or indeed comparable effects on relationship outcomes.

Friday 4 May 2018

Welfare receipt and demography (part 2)

There is an interesting paper in Review of Economics of the Household by Michelmore examining the effect of welfare receipt on union formation. I've blogged previously about the effect of welfare receipt on demographic behaviour, essentially coming to the conclusion that the research there didn't really find what it thought it did: timing effects confounded the supposed impact of welfare receipt on fertility.

The Michelmore paper, however, has a much stronger methodology and is interesting in its approach. The research question is fundamentally whether receipt of the Earned Income Tax Credit (an in-work welfare payment in the United States) was associated with a change in the propensity of women to marry, the argument being that married couples would face a severe welfare cliff (see figure 1) since the EITC is paid based on household income.



Figure 1: Phase in, plateau and phase out of EITC

Identifying this effect is somewhat tricky. There is a lot of missing data: where couples do not coreside we have no information on the potential spouse's income, which determines the potential EITC loss on marriage. It should also be noted that there will be a fair amount of endogeneity: labour market behaviour will be partly determined by relationship status, which muddies the relationship if we want to look at marriage as an outcome. The author takes a simulation approach: statistically constructing a marriage market based on various important characteristics (race, education, etc.) and then randomly pairing recipients with potential spouses. Based on that, we can then get somewhere toward looking at the effect of welfare receipt- or more precisely the threat of loss of welfare- on relationship formation.
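To make the logic concrete, here is a minimal sketch of a random-pairing simulation. It is not Michelmore's actual procedure: the credit schedule parameters, the single `cell` label standing in for the matching characteristics, and the function names are all hypothetical, and for simplicity only the woman's own credit is counted before marriage.

```python
import random

# Hypothetical schedule parameters for illustration only, NOT real EITC values.
PHASE_IN_RATE = 0.40       # credit earned per dollar of income while phasing in
MAX_CREDIT = 4000.0        # plateau
PHASE_OUT_START = 18000.0  # household income at which withdrawal begins
PHASE_OUT_RATE = 0.20      # credit withdrawn per dollar above that point


def credit(household_income):
    """Stylised phase-in, plateau and phase-out schedule."""
    phased_in = min(PHASE_IN_RATE * household_income, MAX_CREDIT)
    withdrawn = max(0.0, household_income - PHASE_OUT_START) * PHASE_OUT_RATE
    return max(0.0, phased_in - withdrawn)


def expected_loss_on_marriage(woman, men, draws=1000, seed=0):
    """Average credit lost when paired at random with men from the same cell.

    Positive values mean a greater loss on marriage.
    """
    rng = random.Random(seed)
    pool = [m for m in men if m["cell"] == woman["cell"]]
    losses = []
    for _ in range(draws):
        spouse = rng.choice(pool)
        single = credit(woman["income"])
        married = credit(woman["income"] + spouse["income"])
        losses.append(single - married)
    return sum(losses) / draws


# Hypothetical marriage market: "cell" stands in for race, education, etc.
men = [
    {"cell": "hs", "income": 20000.0},
    {"cell": "hs", "income": 30000.0},
    {"cell": "college", "income": 60000.0},
]
woman = {"cell": "hs", "income": 12000.0}

loss = expected_loss_on_marriage(woman, men)
print(round(loss, 2))
```

Averaging over many random draws stands in for the unobserved income of the partner a woman would actually marry, which is the missing-data problem described above.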

In the majority of cases, marriage will tend to result in a loss of welfare receipt (figure 2), and since these are relatively disadvantaged women, these losses can be quite substantial. As a result, we see an effect on whether a woman is likely to marry her partner: in the presence of controls, women are 3.1 percentage points less likely to marry and 2.3 percentage points more likely to cohabit.


Figure 2: Expected change in EITC on marriage
More positive values indicate greater loss

There are some positive elements to take away from this: women are partnering and there does at least seem to be engagement with a welfare scheme designed to alleviate poverty. That said, it is concerning that welfare receipt seems to be influencing relationship choices: marriage compared to cohabitation is associated with a raft of advantageous outcomes such as union stability, with knock-on effects on poverty, relationship churn and abuse. While there is some decent evidence that a fair chunk of this is due to selection effects, there is generally some sort of small persistent causal advantage. While I'm not sure that encouraging marriage using welfare or taxes would be a cost-efficient policy given this effect size, we should certainly revisit welfare policies which discourage marriage and wipe out the slight benefit this would provide to a disadvantaged social group.

Wednesday 2 May 2018

Where I Marvel at a villain's bad demography

This post does not deal with a research paper, or anything serious; it exists largely because I went to the cinema at the weekend. As such there are spoilers about Avengers: Infinity War to follow, you have been warned.

The relevant part of the film is the overall plan of the antagonist, Thanos. The problem with the universe as he sees it is a classic Malthusian trap: population is growing too rapidly and will eventually consume all resources. His proposed solution is to reduce the population by 50% on a random basis, which he claims will fulfill two key goals:

1. The population will now be sustainable and not consume all resources, preventing the Malthusian trap (sustainability goal)

2. This will be carried out justly since the randomness in population reduction will ensure no targeting of a certain group (equity goal)

His plan fails on both of these key criteria.

Sustainability goal

Thanos' plan is fundamentally flawed on the grounds that he has merely altered the stock of the universal population: the underlying fundamentals are unchanged (the Salarians in Mass Effect understood this for instance). The graph below plays out this example on Earth (this is the only planet I have data for, from the CIA World Factbook). Using an exponential growth model, I'm making the following assumptions:

World population 2018: 7.3 billion
World population growth rate: 1.6% pa.
World carrying capacity: 10 billion

You can see the two scenarios play out. Without Thanos (orange), the exponential growth model predicts hitting carrying capacity (red dashed line) at around 2038. With Thanos (blue), there is an initial shock to the level of the population, but the population eventually recovers, and indeed hits carrying capacity around about 2082. So while Thanos has bought around 44 years, we will still see population overshoot within my lifetime, let alone his. 
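The dates can be reproduced from the three assumptions listed above. This is a minimal sketch rather than the code behind the graph: it assumes annual compounding and reports the first whole year in which the population is at or above carrying capacity, and the function and variable names are mine.

```python
import math

BASE_YEAR = 2018
POPULATION = 7.3e9   # world population in 2018
GROWTH = 0.016       # growth rate per annum
CAPACITY = 10e9      # carrying capacity


def years_to_reach(start, target, annual_rate):
    """Years of compound growth needed to get from start to target."""
    return math.log(target / start) / math.log(1 + annual_rate)


def first_year_at_capacity(start):
    return BASE_YEAR + math.ceil(years_to_reach(start, CAPACITY, GROWTH))


without_thanos = first_year_at_capacity(POPULATION)
with_thanos = first_year_at_capacity(POPULATION / 2)

# Halving the stock buys exactly one doubling time, whatever the starting level.
doubling_time = math.log(2) / math.log(1 + GROWTH)

print(without_thanos, with_thanos, round(doubling_time, 1))
```

The delay is simply the doubling time, ln 2 / ln(1 + r), which is why changing the stock without changing the growth rate only postpones the overshoot.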

Projected world population with and without Thanos' intervention

Equity goal

Thanos also claimed that the randomness of removing 50% of the population was equitable in its impact. The problem here is that at a global (and presumably galactic) level there are variations in the underlying growth rates of different sub groups. Where population growth rates are low or negative, Thanos risks wiping out certain groups by demographic methods alone. The figure below projects the effect on Moldova, a small country in Eastern Europe that I have an affinity for one reason or another. Moldova currently has negative population growth due to both low fertility and economically motivated outmigration. Again, using an exponential model, we can see that without Thanos Moldova would drop to very low population levels within a few hundred years, eventually reaching extinction (fewer than one Moldovan) in around about 3415. With the loss of 50% of the population, the same rate of decline brings extinction forward to 3350, some 65 years earlier.
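Under exponential decline, halving the population brings any threshold forward by one halving time, whatever the starting size. The sketch below backs out the annual rate of decline implied by the 65-year gap quoted above, assuming annual compounding; the starting population used in the check is a hypothetical round number, not the figure behind the graph.

```python
import math


def implied_annual_decline(years_brought_forward):
    """Annual rate of decline whose halving time equals the given gap."""
    return 1 - 0.5 ** (1 / years_brought_forward)


def years_until_below(start, threshold, annual_decline):
    """Years of compound decline needed to fall from start to threshold."""
    return math.log(threshold / start) / math.log(1 - annual_decline)


rate = implied_annual_decline(65)
print(f"implied decline: {rate:.2%} per year")

# The gap does not depend on the starting population (hypothetical value here).
start = 3.5e6
gap = years_until_below(start, 1, rate) - years_until_below(start / 2, 1, rate)
print(round(gap, 1))
```

Because the gap depends only on the rate, slowly shrinking populations lose far more time from a one-off halving than fast-shrinking ones, which is the equity problem in miniature.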

Projected Moldovan population with and without Thanos' intervention


It should be noted as well that fertility rates are likely to be correlated with a number of other factors- low rates of child bearing are more common among groups with higher education or greater wealth. Thus, Thanos' equity goal is not fulfilled: the removal of 50% of a population at random would lead to a redistribution of population at a galactic level to reflect planets with higher growth rates, and poorer planets yet to experience the full demographic transition. 

Tuesday 27 February 2018

*How* does fertility predict recession?


There has been interest this week in the relationship between fertility behaviour and economic cycles, as picked up by the BBC and Financial Times, based on an NBER working paper by Buckles et al: Is Fertility a Leading Economic Indicator?

The answer essentially seems to be yes- for certain recessions like 2008 at least- there is evidence of a lagged effect of conception rates on economic slumps. Moreover, this is not an effect of increased rates of induced abortion- it does seem like this is an increase in the desire to reduce (or postpone) childbearing in response to economic turmoil.

Now, as full disclosure, this is somewhat inconvenient to me given my previous arguments about our ability to use events like the Great Recession for causal claims. In particular, this tends to skewer the idea that events like the recession were unanticipated: clearly if people are altering their behaviour prior to the event, there is some degree of anticipation.

So how do we explain this predictive effect? The basic model in Buckles et al. is essentially testing for the significance of lagged conception rates on GDP growth at an aggregate level. The most natural explanation would be through employment: spikes in unemployment are associated with a depressing effect on fertility at an aggregate level (although interestingly the effect is more mixed for women: in both Germany and the UK, poor employment prospects for women tend to provoke motherhood). However, Buckles et al. find that falls in fertility generally tend to precede rising rates of unemployment, and that the large rises in unemployment are not manifested until after the recession hits. Employment does not therefore explain why fertility predicts recession.

One interesting avenue is more general measures of economic confidence: Buckles et al. test this with measures of consumer confidence, purchases of consumer durables and housing stock purchases. In general, the anticipatory effects are weaker (although not eliminated) here, with falls in fertility coinciding with declines in these measures of confidence.



Source: Buckles et al., p. 36

This does seem to have some legs, and has been replicated to some extent in other research. Comolli (2017) for instance finds an association between search terms for 'spread' (within the context of the Italian sovereign debt crisis) and subsequent birth rates. That said, Buckles et al. do find that the decline in consumer confidence follows the fall in fertility, so we aren't quite at a full explanation.

To reiterate the points of some of my work on this area, even when we build a reasonable model of what we think should explain fertility behaviour from an economic perspective (including labour market status, wages, wealth stocks, local economic conditions) the persistent significance of crude dummies for pre- and post-recession birth rates indicates that there is something else at play which we can't quite explain: fertility animal spirits. Fertility within the context of macro-economic conditions remains a somewhat poorly understood phenomenon- it's not perhaps a surprise that we can't explain this predictive relationship if we don't have a fully fleshed out model in the first place.