Adding flexibility to impact evaluations with synthetic controls
Difference-in-differences (DID) is one of the leading quasi-experimental methods used in policy and program evaluations. In DID, the effect of an intervention is estimated using observational data from one or more ‘treatment’ areas and one or more matched ‘control’ areas. Changes in the chosen outcome variable are measured over time, from pre- to post-intervention. The difference between the change in the treatment areas and the change in the control areas is used to estimate the intervention’s effect.
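The arithmetic behind the simplest two-area, two-period DID can be sketched in a few lines of Python. The figures below are made up for illustration only (say, average weekly standard drinks per adult), and the function name did_estimate is our own rather than part of any library.

```python
def did_estimate(pre_treat, post_treat, pre_control, post_control):
    """Change in the treatment area minus change in the control area."""
    change_treat = post_treat - pre_treat
    change_control = post_control - pre_control
    return change_treat - change_control


# Hypothetical mean outcomes before and after the intervention
effect = did_estimate(pre_treat=10.0, post_treat=8.0,
                      pre_control=9.0, post_control=8.5)
print(effect)  # -1.5
```

Here the outcome fell by 2.0 in the treatment area and by 0.5 in the control area, so the estimated effect of the intervention is a reduction of 1.5. Real evaluations usually estimate this with a regression so that standard errors and covariates can be included.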
Widely embraced in economics, DID has been applied to investigate the impact of a diverse range of policy changes and interventions, including: the effect of changes to the minimum wage on employment levels; the impact of the 1998 Russian financial crisis on bank lending; and even the influence of Pokémon GO on physical activity.
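While a powerful and flexible method, DID rests on several critical statistical assumptions that must hold for its estimates to be unbiased. One of these is the ‘parallel trend’ assumption: without the intervention, outcomes in the ‘treatment’ and ‘control’ areas would have followed the same trend. In practice, this is usually checked by confirming that pre-intervention trends in the two sets of areas are the same.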
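As a rough first check on parallel trends, the pre-intervention slope of the outcome can be compared between areas. The sketch below fits a straight line to each pre-intervention series; the data and the function name trend_slope are hypothetical, and comparing slopes in this way is an informal diagnostic rather than a formal statistical test.

```python
def trend_slope(values):
    """Least-squares slope of values against the time index 0, 1, 2, ..."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    denominator = sum((t - mean_t) ** 2 for t in range(n))
    return numerator / denominator


# Hypothetical pre-intervention outcomes, one value per period
treatment_pre = [10.0, 10.5, 11.0, 11.5]
control_pre = [9.0, 9.1, 9.2, 9.3]

slope_gap = trend_slope(treatment_pre) - trend_slope(control_pre)
print(f"Difference in pre-intervention slopes: {slope_gap:.2f}")
```

A gap close to zero is consistent with parallel trends; a clear gap, as in this made-up example, suggests the control area is a poor match.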
What do I do if the assumptions underpinning difference-in-differences don’t hold?
In practice, pre-intervention trends are often not parallel. For example, imagine an intervention to reduce harmful drinking is introduced in a large regional town in NSW, Australia. To use DID to evaluate the intervention, researchers need to select control areas. They identify several similarly sized regional towns in NSW and Victoria with roughly comparable demographics. However, when they analyse baseline alcohol consumption, the researchers notice differences in the consumption trends between the ‘treatment’ and ‘control’ towns (perhaps due to local economic or social factors). These differences mean that DID cannot be relied on to estimate the effect of the intervention accurately.
In such instances, synthetic control is often a useful alternative. Introduced by Alberto Abadie and colleagues in the early 2000s, the synthetic control method aims to correct for baseline differences by combining several ‘control’ areas into a weighted average, with the weights chosen so that the average tracks the ‘treatment’ area as closely as possible before the intervention. This weighted average, the synthetic ‘control group’, is more likely to be closely aligned with historic trends in the ‘treatment’ area than any single control area.
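To make the idea of a weighted average concrete, the sketch below chooses non-negative weights that sum to one so that the weighted control series tracks the treatment series before the intervention. It is a simplified illustration under our own assumptions: it matches on pre-intervention outcomes only (published implementations typically also match on other predictors), it uses a basic Frank-Wolfe search rather than the optimiser of any particular package, and the town data and the function name synthetic_weights are hypothetical.

```python
def synthetic_weights(treated, controls, iterations=2000):
    """Weights (non-negative, summing to 1) that minimise the squared gap
    between the treated series and the weighted average of control series.

    treated:  list of pre-intervention outcomes for the treatment area
    controls: list of lists, one pre-intervention series per control area
    """
    n_controls = len(controls)
    n_periods = len(treated)
    weights = [1.0 / n_controls] * n_controls  # start from a simple average

    for _ in range(iterations):
        # Gap between the current synthetic series and the treated series
        synthetic = [sum(w * c[t] for w, c in zip(weights, controls))
                     for t in range(n_periods)]
        residual = [s - y for s, y in zip(synthetic, treated)]
        # Gradient of the squared error for each weight (up to a factor of 2)
        gradient = [sum(r * c[t] for t, r in enumerate(residual))
                    for c in controls]
        # Frank-Wolfe step: shift weight toward the most helpful control
        best = min(range(n_controls), key=gradient.__getitem__)
        direction = [(1.0 if j == best else 0.0) - w
                     for j, w in enumerate(weights)]
        move = [sum(d * c[t] for d, c in zip(direction, controls))
                for t in range(n_periods)]
        denominator = sum(m * m for m in move)
        if denominator == 0.0:
            break
        # Exact line search for a quadratic objective, kept within [0, 1]
        step = -sum(r * m for r, m in zip(residual, move)) / denominator
        step = max(0.0, min(1.0, step))
        if step == 0.0:
            break  # no further improvement is possible
        weights = [w + step * d for w, d in zip(weights, direction)]
    return weights


# Hypothetical pre-intervention outcomes for one treatment town
# and three candidate control towns
treated_pre = [10.5, 10.6, 10.8, 11.0]
control_towns = [
    [12.0, 11.8, 11.5, 11.1],
    [9.0, 9.2, 9.1, 9.3],
    [10.0, 10.4, 10.9, 11.5],
]
weights = synthetic_weights(treated_pre, control_towns)
print([round(w, 3) for w in weights])
```

The post-intervention outcomes of the control towns, combined using these weights, then serve as the estimate of what would have happened in the treatment town without the intervention.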
How are synthetic controls used?
Synthetic control was first applied to estimate the economic ramifications of terrorism in the Basque Country, Spain, by comparing the regional GDP of the Basque region against a synthetic control that did not experience terrorism. Since this study, synthetic control methods have been used to estimate the impact of Proposition 99, a tobacco control program in California; the effects of reunification on the West German economy; and the impact of Hugo Chavez’s leadership on the Venezuelan economy.
To use the synthetic control method, you need data from multiple comparative case studies (i.e., at least one ‘treatment’ area and many ‘control’ areas). The datasets for these should ideally have a high number of observations spanning pre- and post-intervention periods. Synthetic control is therefore best suited to the evaluation of interventions for which there is robust data infrastructure, and inappropriate for evaluations where collection of data is challenging.
Recently, Arkhangelsky and colleagues combined synthetic control with DID to create a promising new method: ‘synthetic difference-in-differences’. This estimator, which combines the strengths of each of the methods it draws from, appears to perform effectively across a variety of different potential evaluation scenarios.
This article was issued under our former global brand name: Kantar Public.
Senior Director- APAC Head of Behavioural and Communications