January 1, 2022

This is just a reminder to myself. There are many steps to keeping a model like this up to date and, sadly, no real way of automating the whole process.

This note tries to bring together all the steps needed to add a year’s worth of data to the model (FRS/SHS), uprate the variables and create a new weighting target dataset.

This is convoluted. I don’t remember Taxben being this hard. Ideally I’d automate much of this - I had a brief go using some of the APIs for grabbing ONS data, but didn’t get far.

Also, paths and filenames are often hard-wired in. TODO: add a paths config file.

Always keep running the test suite while you’re doing any of this.

The struct UpdatingInfo in Definitions.jl holds static info on when each component below was last updated and should be kept up-to-date.

File paths are held as constants in Definitions.jl.

Raw survey data comes from the UK Data Service (login details are in KeePass). Unpack each dataset into the $RAW_DATA/[dataset]/[year] directories and add symlinks to the tab and mrdoc directories.
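A minimal sketch of the unpack-and-link step, in case it ever gets scripted. Everything here is hypothetical: the function name, its arguments, and the assumption that the unpacked archive has tab and mrdoc as top-level subdirectories. None of it comes from the model code.

```julia
# Hypothetical helper, not part of ScottishTaxBenefitModel.
# Creates raw_data/dataset/year and symlinks the `tab` and `mrdoc`
# subdirectories of an already-unpacked archive into it.
function link_raw_dataset(raw_data::String, dataset::String, year::Int, unpacked::String)
    target = joinpath(raw_data, dataset, string(year))
    mkpath(target)
    for sub in ("tab", "mrdoc")
        src = joinpath(unpacked, sub)
        dest = joinpath(target, sub)
        # islink also catches dangling links, which ispath would miss
        if isdir(src) && !(islink(dest) || ispath(dest))
            symlink(src, dest)
        end
    end
    return target
end
```

Called as, say, link_raw_dataset(ENV["RAW_DATA"], "frs", 2020, "/path/to/unpacked"), it is safe to run twice since existing links are left alone. On Windows, creating symlinks may need extra privileges.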

General usage.

In what follows I assume you’re using the REPL plus Revise, having started Julia in the ScottishTaxBenefitModel home directory.

1) load a script file:

] activate .
using Revise
includet( "src/[yourfiles]" )

2) run the full test suite:

] activate .
] test

3) load specific test:

] activate .
using Revise, ScottishTaxBenefitModel
includet( "test/testutils.jl")
includet( "test/[your specific test]" )


1. Adding a new FRS year

Code is HouseholdMappingFRS_HBAI.jl (note: not a package). Check the .tab files and .docs carefully:

Note that we use HBAI for the optional SPI-adjusted wage and self-employment data, so we can only add a year once the HBAI is released.

Note paths wired in to Definitions.jl.

Then run create_data(). This creates a full UK-wide dataset. Run scripts/create_scottish_subset with ADD_IN_MATCHING set to false (initially) to create just the Scottish subset. ADD_IN_MATCHING needs to stay false until the SHS matching in section 2 below has been done.

2. Matching in a new SHS

Unpack new SHS as above. The matching code is an unholy mess.

In matching/:

3. Creating a target weighting dataset

The directory (for 2022) is data/targets/aug-2022-updates/; create something similar for each year. The main workfile is target_generation.ods, which attempts to get counts of people, households, employment, etc. consistent.

Output is (for 2022) a 90-piece target set. Sources:

All this has to be merged together manually on any update, I’m afraid. Note how we change the standard age ranges 10-14 and 15-19 to 10-15 and 16-19 to mesh better with the employment data. Note also how everything needs to be scaled to match the 2022 household/population numbers (the control total is all people or all households depending on the question; see the spreadsheet).
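The two adjustments above are done by hand in the spreadsheet, but they amount to something like the following sketch. It assumes single-year-of-age counts are available, which is what makes the re-banding possible. The function names and all the numbers are made up for illustration.

```julia
# Sum single-year-of-age counts into arbitrary age bands.
# counts[i] holds the number of people aged i-1 (Julia arrays are 1-based).
function rebin_ages(counts::Vector{Float64}, bands::Vector{UnitRange{Int}})
    return [sum(counts[(first(b) + 1):(last(b) + 1)]) for b in bands]
end

# Rescale a set of targets so that they sum to a control total
# (a population or household count, depending on the question).
function scale_to_total(targets::Vector{Float64}, control_total::Float64)
    return targets .* (control_total / sum(targets))
end

# Made-up data: 1,000 people at every age from 0 to 19.
by_age = fill(1000.0, 20)
age_bands = [10:15, 16:19]                    # instead of the standard 10:14, 15:19
rebinned = rebin_ages(by_age, age_bands)      # counts for ages 10-15 and 16-19
scaled = scale_to_total(rebinned, 11_000.0)   # forced to sum to the control total
```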

4. Uprating

Main uprating file is data/prices/indexes/indexes.tab. Uprating code is Uprating.jl; filenames and uprating targets in Settings.jl. Sources are as in indexes.tab header rows. Indexes are quarterly. Sources:

FIXME this needs updating urgently.
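For reference, uprating with a quarterly index boils down to multiplying by the ratio of the index at the target quarter to the index at the quarter the data was collected. This is a sketch of that arithmetic only, not the code in Uprating.jl. The dictionary layout, the function name and the index values are all invented.

```julia
# Hypothetical quarterly index keyed by (year, quarter); the values are made up.
demo_index = Dict(
    (2019, 1) => 100.0,
    (2020, 1) => 102.0,
    (2022, 1) => 110.0)

# Quarter (1-4) that a calendar month (1-12) falls in.
quarter_of(month::Int) = (month - 1) ÷ 3 + 1

# Uprate `amount` observed in quarter `from` to the prices of quarter `to`.
function uprate(amount::Float64, from::Tuple{Int,Int}, to::Tuple{Int,Int}, index)
    return amount * index[to] / index[from]
end

# A wage of 200 recorded in February 2019, expressed in 2022 Q1 terms.
uprated = uprate(200.0, (2019, quarter_of(2)), (2022, 1), demo_index)
```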

5. Benefits

There are three things here: numbers for the transition to Universal Credit (UC), estimates of how many people on legacy disability benefits we should move to the new benefits, and some probits we use to model the generosity of disability tests.

5.1 The Legacy/UC transition

This is done very, very crudely using House of Commons data. We use Scotland-wide approximations, which are then hard-wired into UCTransition.jl. We could use local authority level data if someone still produced it (the HoC data is by constituency). Can’t be bothered trying myself.

5.2 Model Transitions to new disability/carer benefits

Code is HistoricBenefits.jl. It re-assigns DLA recipients to PIP according to proportions on each in the interview month for Scotland as a whole.
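One way to picture the reassignment is as a random draw per recipient against the PIP share for the interview month. This is a sketch of the idea only and may not match what HistoricBenefits.jl actually does. The shares, the function name and the data layout are hypothetical.

```julia
using Random

# Hypothetical: share of DLA/PIP recipients in Scotland who are on PIP,
# keyed by (year, month) of interview. The values are made up.
pip_share = Dict((2015, 6) => 0.25, (2019, 6) => 0.80)

# Decide whether one recorded DLA recipient should be treated as on PIP,
# by drawing against the PIP share for their interview month.
function reassign(rng::AbstractRNG, year_month::Tuple{Int,Int}, shares)
    return rand(rng) < shares[year_month] ? :pip : :dla
end

rng = MersenneTwister(1234)   # fixed seed so runs are repeatable
draws = [reassign(rng, (2019, 6), pip_share) for _ in 1:10_000]
share_moved = count(==(:pip), draws) / length(draws)
```

With enough people, share_moved ends up close to the 0.80 in the lookup, which is all this approach tries to guarantee.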

Data files are:

To update these, randomly press buttons on Stat-Xplore until something comes out: you want DLA/PIP in receipt, including cases devolved to Scotland, from the current tables. Note that I have a saved table format for PIP. Export as .xlsx. Transpose in OpenOffice to the same format as data/receipts/pip_2002-2020_from_stat_explore.csv. Change the filename in HistoricBenefits.jl.

You also need to update params/historic_benefits.csv; see section on updating parameters below.

5.3 Benefit Generosity

Main script is regressions/disability_regressions.jl

Creates candidates files in data/disability/

If the data has been created correctly, just running the script should create these files automatically. A data year dummy for the new year’s data should be automatically added.

6. Adding new default parameters

Most of the individual-level tests are based on the system as it was when I started, using the 2020/21 values hard-wired into the parameter definitions with the @with_kw macro. So, don’t alter the defaults there. Instead, copy sys_2022-23.jl and update that. This can be loaded using load_file in STBParameters. If things are changing rapidly you can add or remove some parameters in a separate file and layer that on top of the main parameters using the mutating load_file! function.
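The pattern of leaving the defaults alone and layering changes on top looks roughly like this. It is a stand-in, not the STBParameters API: DemoSys and apply_overrides! are hypothetical, Base.@kwdef is used as a standard-library analogue of @with_kw, and the field names and values are illustrative.

```julia
# Stand-in for a parameter struct with hard-wired defaults.
Base.@kwdef mutable struct DemoSys
    personal_allowance::Float64 = 12_500.0
    basic_rate::Float64 = 20.0
end

# Stand-in for a mutating loader: apply a set of overrides on top of an
# existing system, leaving every other parameter untouched.
function apply_overrides!(sys::DemoSys, overrides::Dict{Symbol,Float64})
    for (name, value) in overrides
        setproperty!(sys, name, value)
    end
    return sys
end

defaults = DemoSys()   # what the unit tests keep relying on
sys_new = apply_overrides!(DemoSys(), Dict(:personal_allowance => 12_570.0))
```

The point of the design is that defaults never changes, so tests written against it keep passing while sys_new carries the new year’s values.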

6.1 Direct Taxes

Note that it’s best to get an updated version of Melville’s Taxation for a consolidated set of parameters and test examples.

But use Melville.

6.2 UK Benefits

The only place I know of that has everything together is the CPAG Guide.


6.3 Scottish Benefits

See here. Notes:

6.4 Local Housing Allowances

2022/3 values are here.

Note the BRMA definitions are treated as constants.

“for this year (2022-23) all rates have been frozen at the rate last determined on 31st March 2020. This was the 30th percentile at that time.”

So I’ll skip changing this for now.

6.5 Council Tax


This needs to be parameterised better.

Default values are hard-wired into the default_band_ds function in STBParameters.

There is an example of loading new values at the bottom of sys_2022-23.jl. Values are from the ScotGov CT Datasets. We just need the band Ds here so long as the relativities don’t change.
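To see why band D is enough, here is a sketch that derives every band from a band D figure and a fixed set of relativities. The function name is hypothetical and the band D amount is made up. The proportions (in 360ths of band D) are an assumption: they are the Scottish ones as understood to apply since April 2017, and should be checked against the current rules before use.

```julia
# Band proportions in 360ths of band D. ASSUMPTION: Scottish values
# from April 2017 onwards; verify before relying on them.
relativities = [:A => 240, :B => 280, :C => 320, :D => 360,
                :E => 473, :F => 585, :G => 705, :H => 882]

# Derive the charge for every band from the band D charge.
function bands_from_d(band_d::Float64, relativities)
    return Dict(band => band_d * n / 360 for (band, n) in relativities)
end

ct_bands = bands_from_d(1_200.0, relativities)   # 1,200 is a made-up band D
```

If the relativities themselves change, as they did for bands E to H in 2017, the vector above needs updating as well as the band D values.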

7. Updating Tests

7.1 Individual Level Unit Tests

7.2 Tests in Aggregate - sources

8 Notes on data sources


Updating - January 1, 2022 - Graham Stark