Friday 31 March 2017

Research publication: Latent class models for cross-national comparisons

Latent class models for cross-national comparisons: The association between individual & national-level characteristics in fertility & partnership was published today in the International Journal of Population Studies. 


Abstract: Multilevel modelling techniques such as random models or fixed effect are increasingly used in social sciences and demography to both account for clustering within higher level aggregations and evaluate the interaction between individual and contextual information. While this is justifiable in some studies, the extension of multilevel models to national level analysis — and particularly cross-national comparative analysis — is problematic and can hamper the understanding of the interplay between individual and country level characteristics. This paper proposes an alternative approach, which allocates countries to classes based on economic, labour market and policy characteristics. Classes influence the profiles of three key demographic behaviours at a sub-national level: marriage, cohabitation and first birth timing. Woman level data are drawn from a subset of the Harmonized Histories dataset, and national level information from the GGP contextual database. In this example, three country classes are extracted reflecting two Western patterns and an Eastern pattern, divided approximately along the Hajnal line. While Western countries tend to exhibit higher levels of family allowances albeit accounting for a lower share of spending which is associated with lower marriage and later fertility, Eastern countries generally show a higher share of spending but at lower absolute levels with lower cohabitation rates and early fertility

Link to the full paper can be found here. This is open access, but I've included some thoughts below as well.

This paper is somewhat related to previous grumbles about the way in which demographic behaviour and institutions are modelled, or at least talked about. For full disclosure, this paper does not deal with the sort of endogeneity problems that I have talked about before. It does, however, I think, provide a more coherent framework for analysing the interaction between national level policy and individual level demographic behaviour.

One of the major innovations in statistics has been the use of multilevel models. Random effects type models will typically deal with some degree of dependence of individual level outcomes on the context within which they are located. Examples of this can be found in schooling: there is variation in pupil level outcomes which is not explained by standard controls such as receipt of Free School Meals (a poverty measure), SEN, ethnicity, etc. Multilevel models are able to partition this variation into individual (attributable to the pupil) and school level components. (This, as I understand it, is the basis of the Contextual Value Added measure in UK school league tables: the CVA is the school level residual in a random effects multilevel model.) The school level variation is incorporated as an additional error term (often with a Normal distribution for convenience).

While this works well for schools (there are enough of them that the school level errors can approximate a Normal distribution), whether this applies to countries is more questionable. It takes a lot of work to get even a dozen or so countries in a form where they can be modelled together (for instance see the excellent Harmonized Histories), but we are still below the sort of numbers where we can reasonably expect the CLT to apply. More problematic is the consistency between the model assumptions and the application: purposive selection of countries for analysis seems strongly at odds with the assumption that our observations follow the sort of stochastic process required for the model, and the endogeneity problems seem to contradict the iid assumptions on the residuals. In sum, the application of random effects models seems like too serious a violation of Gauss-Markov assumptions to be tolerable: obviously all models are wrong, but there needs to be some sort of consistency between our theory and estimated models. We're not Milton Friedman.

The other problem is that since we often want to make statements about cross national variation, the fact that this variation of interest is now subsumed into an error term tends to mean that actually interpreting variation between countries is tough.
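To give a rough feel for why the number of higher level units matters, here is a minimal simulation sketch. It is a deliberate simplification and not the model from the paper: it assumes the group level effects are observed directly and really are Normal, and the group counts (10 "countries", 500 "schools") are illustrative choices of mine. Even under those generous assumptions, the estimated group level variance is far noisier with ten groups.

```python
import random
import statistics

def spread_of_variance_estimates(n_groups, n_reps=1000, true_sd=1.0, seed=1):
    """Standard deviation, across replications, of the sample variance of
    n_groups group-level effects drawn from N(0, true_sd^2)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        # Hypothetical group-level effects (treated as directly observed)
        effects = [rng.gauss(0.0, true_sd) for _ in range(n_groups)]
        estimates.append(statistics.variance(effects))
    return statistics.stdev(estimates)

# Illustrative group counts: a handful of countries vs. many schools
spread_countries = spread_of_variance_estimates(n_groups=10)
spread_schools = spread_of_variance_estimates(n_groups=500)

print(f"10 groups:  spread of variance estimate = {spread_countries:.3f}")
print(f"500 groups: spread of variance estimate = {spread_schools:.3f}")
```

In a real multilevel model the group effects are themselves estimated, so if anything this understates the problem for a dozen countries.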

Fixed effects models are slightly better in that they require fewer assumptions of that nature, but they still don't get all of the way to addressing the sort of problems we might have. The usual concerns about statistical efficiency are probably not relevant here: the small number of countries probably means that there are not many efficiency gains to be made by including a variance term rather than 7 dummies. The problem, however, is that if we want to make statements about the effect of national level policy, the policy indicator is now fully confounded with the country fixed effect, so we have an identification problem. Unless we are including the fixed effect just to remove noise (so why are we taking a multilevel approach anyway?) this is a major research drawback.
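The identification problem can be seen in a small sketch. Including country dummies is equivalent to demeaning every variable within country (the "within" transformation), and a variable that only varies between countries has nothing left after that step. The data, country labels and column layout below are entirely hypothetical and are only there to make the point concrete.

```python
from collections import defaultdict

# Hypothetical rows: (country, national policy indicator, individual age)
rows = [
    ("A", 1.0, 23), ("A", 1.0, 31), ("A", 1.0, 27),
    ("B", 0.5, 25), ("B", 0.5, 35), ("B", 0.5, 29),
    ("C", 2.0, 22), ("C", 2.0, 30), ("C", 2.0, 40),
]

def within_transform(rows, column):
    """Subtract the country mean from each value in the given column."""
    by_country = defaultdict(list)
    for row in rows:
        by_country[row[0]].append(row[column])
    means = {c: sum(v) / len(v) for c, v in by_country.items()}
    return [row[column] - means[row[0]] for row in rows]

policy_within = within_transform(rows, column=1)
age_within = within_transform(rows, column=2)

# The policy indicator is constant within country, so nothing is left
print("policy after demeaning:", policy_within)
# The individual level covariate still varies within country
print("age after demeaning:   ", age_within)
```

With no within-country variation left in the policy indicator, its coefficient cannot be estimated separately from the country dummies.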

The proposed solution is something of a hybrid: I use national level policy indicators to form latent classes (shown below) and then model demographic behaviours dependent on the country level class. 



Lyons-Amos (2016)

There are a number of advantages to this approach. We aren't purely tied to the reasonably strict assumptions of the random effects model: the latent classes are discrete, so we can account for clumpiness in the 'residual.' Also, these classes are qualitatively interpretable: the use of policy indicators means that we can see what the class actually means.


Table 3: Estimated mean levels of country level indicators by latent class (number of country members in parentheses)

Class 1: Eastern Europe (4)
Class 2: Western Europe lesser support (2)
Class 3: Western Europe higher support (4)

Indicator                                          Class 1   Class 2   Class 3

Family support
Value of family allowance (PPP adjusted 2005 $)      82.21     92.26     133.0
Family allowance (% of GDP)                           1.38      0.11      1.78
Social expenditure (% of GDP)                        13.16      26.8     26.13
Public expenditure on childcare (% of GDP)            0.38      0.64      0.38

Ease of childcare
Female labour force participation (%)                64.40     73.40     70.46
School entry age                                      3.25      4.00      3.00

Legal status of cohabitation
Cohabitation mentioned (%)                            26.7      26.0      29.7
Legal equivalence (%)                                 30.7      37.5      34.5
Legally recognised (prob)                             0.00      0.50      0.99

Lyons-Amos (2016)

Naturally there are some limitations in what we are able to do with these models, which I talk about in the paper: this is meant to be a step on the way, not a complete solution. Whether this application is relevant remains to be seen as well: the added difficulty in estimating the latent class model will be a turn off for many. However, models need to be as simple as possible, but no simpler. I'm not sure whether, for many demographers, simply running MLwiN or xtmixed adequately addresses the latter concern.

Wednesday 15 March 2017

Inclusive institutions are endogenous institutions


This post is a somewhat fleshed out version of a slightly snarky tweet based on an article published in Demographic Research. Before we proceed, I should say that this is not intended as a criticism of the article directly: the article is an example of a somewhat wider trend in the demographic literature.

Equality at home - A question of career? Housework, norms, and policies in a European comparative perspective
Abstract
Background: Dual-earner families are widespread in contemporary Europe, yet the division of housework is highly gendered, with women still bearing the lion’s share. However, women in dual-career couples and in other types of non-traditional couples, across and within different European countries, appear to handle the division of housework differently.
Objective: The objective of this study is to examine the division of housework among various couple-earner types, by determining i) whether relative resources, time spent on paid work, gender attitudes, and family structure reduce variations in housework between different couple types, and ii) whether the division of housework varies between countries with different work‒family policies and gender norms.
Methods: The study uses data from ten countries, representing different welfare regime types, extracted from the European Social Survey (2010/11), and employs multivariate regressions and aggregated analysis of the association between the division of housework and the contextual indices.
Results: The results show that dual-career couples divide housework more equally than dual-earner couples, relating more to the fact that the former group of women do less housework in general, rather than that men are doing more. The cross-national analysis shows tangible differences between dual-earner and dual-career couples; however, the difference is less marked with respect to the division of housework in countries with more institutional support for work‒family reconciliation and less traditional gender norms.
Contribution: By combining conventional economic and gender-based approaches with an institutional framework, this study contributes to the research field by showing that the division of housework within different couple-earner types is contextually embedded

In and of itself, I'm rather happy with this abstract. I think there is a strong contribution here: the interaction between social institutions and demographic behaviour is clearly important and certainly underexplored. What I have a quibble with is the final modelling set up: the variable of interest here is the division of housework, with a gender norm index and a work-family policy index as independent variables (plus controls). I am totally unconvinced by the idea that work-family policy is a control variable here: to me both household work and work-family policy are endogenous outcomes of gender norms.
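To illustrate the worry, here is a small simulation sketch of my own (not anything from the article). It assumes a world in which gender norms drive both work-family policy and the division of housework, and policy has no direct effect on housework at all. The effect sizes, the sample size and the variable names are all made up for the illustration.

```python
import random

def slope(x, y):
    """OLS slope of y on x (regression with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after an OLS regression on x."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

rng = random.Random(42)
n = 5000  # hypothetical number of units, large only to keep noise down

# Assumed data generating process: norms cause both policy and housework
norms = [rng.gauss(0, 1) for _ in range(n)]
policy = [0.8 * g + rng.gauss(0, 1) for g in norms]
housework = [0.8 * g + rng.gauss(0, 1) for g in norms]  # no policy term

# Regressing housework on policy alone picks up the shared cause
naive = slope(policy, housework)
# Partialling norms out of both variables removes the association
adjusted = slope(residuals(norms, policy), residuals(norms, housework))

print(f"policy 'effect' ignoring norms:    {naive:.2f}")
print(f"policy 'effect' net of norms:      {adjusted:.2f}")
```

Under these assumptions the policy coefficient looks substantial until the common cause is accounted for, which is the sense in which policy behaves as an outcome rather than a control.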

Example: Divorce reform in the United Kingdom
To take a further example, we'll examine marital dissolution in the United Kingdom. As a brief introduction, prior to 1969 couples in the United Kingdom who wanted to divorce faced a rather onerous burden: proof was required of grounds for divorce, including adultery, drunkenness, insanity and cruelty. The 1969 Divorce Reform Act changed this: couples could now divorce after 2 years' separation with no 'fault.' Here's what happened to divorce rates:



Fair enough: here we can see that there is a (sharp) uptick in the number of divorces following the Divorce Reform Act. A reform to a legal institution resulted in a change in demographic behaviour.

The problem here is the extent to which we regard the change in the divorce law as exogenous to the actual behaviours that were occurring in the UK. We should note that the Royal Commission on Marriage and Divorce (1951-55) recommended against any reform of divorce law, arguing from moral principles that family life should be protected:

The large number of marriages which each year are ending in the divorce court is a matter of grave concern...This disturbing situation is attributable to a variety of factors ...In the first place, marriages today are at risk to a greater extent than formerly. The complexity of modern life multiplies the potential causes of disagreement and the possibilities of friction between husband and wife...It must also be recognised that greater demands are now made of marriage, consequent on the spread of education, higher standards of living and the social and economic emancipation of women...Old restraints, such as social penalties on sexual relations outside marriage, have been weakened...[and there is] a tendency to regard the assertion of one's own individuality as a right, and to pursue one's personal satisfaction, reckless of the consequences to others...There is a tendency to take the duties and responsibilities of marriage less seriously than formerly

Royal Commission on Marriage and Divorce, 1956: 7-8 

We can see here, however, a recognition that behaviours were already changing: from our figure above, the number of divorces is already increasing before divorce reform. Given that laws are formed partially to reflect the cultural mores of the societies that they are governing (they are partially inclusive, rather than absolutist as per the Commission's recommendations), the Divorce Reform Act is an outcome of social shift (by the late 60s even the Church had recognised that fault was not a basis on which to grant secular divorce). This shift is also reflected in demographic behaviours: the legal institution and the behaviour it governs are endogenous. This implies that any demographic article which models policy and associated demographic behaviours needs to recognise this endogeneity: legal and policy institutions are as much an outcome variable as divorce rates. Let's model them that way.

Friday 10 March 2017

How bad demography gave me period pains

Recently, neurobiologist/personality scientist Adam Perkins published a book entitled The Welfare Trait, in which he claimed that employment resistant personality types are encouraged to breed by the current structure of the (UK) welfare state. Unsurprisingly, this claim was somewhat controversial, with criticism of a somewhat technical nature coming from some media, and some puff pieces about how the book was being censored by 'the establishment' coming from the literal son of a baron.

Perkins' theory
The effects of this [1999 UK increase in child benefit] change to UK welfare provisioning have been studied in detail by Brewer et al. (2011) revealing that reproduction is more sensitive to changes in welfare legislation than had ever previously been shown: not only did these increases in the generosity of per-child welfare in the UK in 1999 increase the number of children born to benefit recipients by approximately 15% [emphasis mine] but also this effect was nuanced according to the specific opportunity-costs circumstances of the individual women.
The Welfare Trait, Chapter 4

As the quote above highlights, Perkins argues essentially that the effect of welfare is to increase the number of children born to families with employment resistant personality types, thus selecting children into this environment and increasing the prevalence of this trait within the population. Consider the following Lexis diagram which takes a hypothetical woman across her reproductive life course, with filled dots representing childbearing. In the absence of policy, she has two children as shown below. 


The paper by Brewer does indeed find that there is an increase in childbearing in response to the introduction of more generous child benefits, where childbearing is operationalised as having had a child within the previous twelve months. This leads Perkins to make the claim that the increased welfare does something like this, and that we increase the number of children across the lifetime due to the additional birth (empty dot).


Why is this bad demography?
Perkins' mistake here is that he is using period (i.e. point or interval in time) evidence to make claims about cohort (or lifetime) effects. It should be stressed that there is a lot of evidence that increased welfare provision, particularly that targeted at children, does increase annual/period childbearing rates (e.g. Parr and Guest 2011). Additionally, there is some evidence that the economically disadvantaged (this is what Perkins means by "specific opportunity-costs circumstances of the individual women") are likely to make greater family orientated transitions (particularly females), that this is relatively independent of their economic outlook (as per Perkins' idea of employment resistance, or at least ambivalence) and that their fertility behaviour tends to depend on the welfare regime in which they are living. 


However, this is not sufficient evidence that this will cause the total number of children that a woman bears across her lifetime to increase: it is perfectly possible that a woman will bring forward her planned childbearing (grey arrow in the figure above) to take advantage of welfare payments which may later be withdrawn, and have the same number of children across her lifetime as in the counterfactual of no welfare reform. Indeed, most studies that look at completed childbearing tend to find that the effects are strongest in terms of timing (Bjorklund 2006), leading to a compression of birth intervals rather than increasing parity progression, with the effect of cash policies on long run childbearing being rather small (Gauthier and Hatzius 1996). Indeed, one of the only papers I am aware of that claims anything like a substantial effect on cohort fertility refers to Glasnost era Soviet Russia. I am unsure that this is a solid basis for making inferences to the 21st century UK.
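The period versus cohort distinction can be made concrete with a stylised sketch. Everything in it is a hypothetical assumption for illustration (every woman plans exactly two births at fixed ages, and the reform simply brings any later birth forward by one year); it is not the model used by Brewer et al. or anyone else cited here.

```python
REFORM_YEAR = 1999
PLANNED_AGES = (26, 31)  # hypothetical: every woman plans two births
ADVANCE = 1              # hypothetical: births are brought forward one year

def birth_years(birth_cohort, reform=False):
    """Calendar years in which a woman born in birth_cohort gives birth."""
    years = []
    for age in PLANNED_AGES:
        year = birth_cohort + age
        if reform and year > REFORM_YEAR:
            # Bring the birth forward, but never to before the reform
            year = max(REFORM_YEAR, year - ADVANCE)
        years.append(year)
    return years

cohorts = range(1960, 1986)  # hypothetical birth cohorts, one woman each

def births_in_year(year, reform):
    """Period measure: number of births observed in one calendar year."""
    return sum(birth_years(c, reform).count(year) for c in cohorts)

def completed_fertility(reform):
    """Cohort measure: lifetime number of births for each woman."""
    return [len(birth_years(c, reform)) for c in cohorts]

print("births in 1999, no reform:", births_in_year(1999, reform=False))
print("births in 1999, reform:   ", births_in_year(1999, reform=True))
print("lifetime births unchanged:",
      completed_fertility(True) == completed_fertility(False))
```

In this toy set up the reform year shows a jump in births, exactly the sort of thing a "birth in the last twelve months" measure would pick up, while every woman still ends up with two children.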

Monday 6 March 2017

Do teaching personalities vary? Reflecting the difference in student learning types onto teaching styles (Reblog from LSHTM PGCILT)

I recently delivered some taught materials as part of an ongoing favour. I’m not going to disclose the institution or course for privacy reasons, but for some context: I helped deliver this course while I was a postdoc at this institution, and have written and delivered materials for this course. On this occasion however, I was delivering a lecture and workshop on a slightly different topic and was sent some materials that someone else had prepared. Now, these materials I have to say were great- they took a reasonably technical topic and presented it clearly so people working at a variety of levels could access the materials. As well as the technical lectures there was an applied session where the students could implement the models being described on a real life dataset to address research questions. In terms of synthesising learning styles, this was top notch.

The problem however was that I found this material exceptionally difficult to deliver. Now, as a reasonably diligent demonstrator I had of course been through the materials beforehand, and knew what I was covering. I will again stress that I found the materials excellent- from the perspective from which I was receiving them for the first time: effectively as a student.


However, the experiences of the student and the teacher are fundamentally different. So while I had a decent understanding of the student experience, I didn’t really have an overview of the teacher’s perspective. I also fully accept that there is something for me to learn in terms of my teaching delivery: I’m rather more linear in my delivery, which these materials eschewed in favour of breaking up the theoretical and practical elements. The fundamental point, I feel, is this: not only are there different learning styles that we need to take on board when designing materials, but there are differing teaching styles as well, which we should accommodate both for others who might use our materials (think TAs here) and for our own purposes.

Welcome!

Hello, and welcome to my blog. This is partly inspired by activity I'm undertaking as part of my teaching training at LSHTM (these posts will be reblogged periodically), but it also caters for my wider interests in demography and social statistics. I will be discussing these as thoughts come to mind. Do note that quite often these thoughts are works in progress, so should not be taken as anything definitive.

The blog should cater to people in a few audiences:

1) Those interested in demography and associated fields in quantitative social science.
2) Those with an interest in the application of statistical methods to social science problems (I'm not a statistician, I'm a user of statistics. There will not be a lot of formal mathematics here)
3) Those with a tolerance for mixing classical languages in blog names

Comments and dialogue are welcome. Hope you enjoy.