**By Peter Gorham, FCIA**

About 10 years ago, I received a call from a lawyer looking for help in determining the expected number of members in a class action. It was a historical action dating back to the 1940s. My mandate was to determine the number of Indigenous children who attended one of several schools from the 1940s to the 1990s. That number was then to be split between surviving and deceased students as of 2010. On its face, a simple calculation: determine the number of graduating students from each school in each year and apply survivorship probabilities.

But there were three particularly interesting and challenging aspects to the mandate that made it anything but simple.

- The data about students only existed for about 40% of the school years. And some of that looked suspicious.
- What data existed only showed the total number of students during the year. There was no information about entering and leaving students.
- Historical survival statistics for Indigenous Peoples for the last 80 years were needed.

Since then, I have been retained on many other historic class actions. Many of them date back to the early 1900s. In each case, there were data gaps – some worse than others. The data that did exist could not be directly used and required adjustments before the actual number of people involved could be determined. And survivorship statistics appropriate for the class members had to be developed.

My first reaction was that the data was too poor to be reliable and it would be foolish to accept the mandate. I was essentially being asked to make a black forest cake out of a cherry pit and some chocolate crumbs. But to do nothing would leave both the plaintiffs’ and the defendants’ counsel with little idea of the expected class size. That would significantly complicate reaching a settlement – unless the defendants were willing to issue a blank cheque. I figured any answer, properly qualified, was better than no answer.

### Data gaps

I tried various techniques to fill in the data gaps but ultimately concluded that there was nothing better than applying actuarial art – estimating the missing data by hand. Consider the following sequence of total enrolment at a school:

| Year | Enrolment | Year | Enrolment |
|------|-----------|------|-----------|
| 1950 | 48 | 1955 | – |
| 1951 | 50 | 1956 | – |
| 1952 | 54 | 1957 | – |
| 1953 | 60 | 1958 | 62 |
| 1954 | – | 1959 | 60 |

Most techniques I tried would fill in those blanks by increasing the enrolment to a peak of between 75 and 80 before dropping back to 62 students in 1958. But for a school, that would have meant either temporary accommodations, sufficient unused space, or heavy overcrowding. That might have been appropriate once or twice, but there were many gaps that looked similar. Classroom space is not very elastic. I concluded it was much more likely that the enrolment had peaked at about 62 students and, from eyeballing the data, filled in the missing years accordingly.
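The contrast between a trend-based fill and a capacity-limited one can be sketched in a few lines of Python. This is only an illustration under assumed rules, not the hand adjustment described above: `fill_trend`, `cap_at_capacity`, and the capacity of 62 are hypothetical choices made for the example.

```python
# Enrolment from the table above; None marks a year with no data.
years = list(range(1950, 1960))
enrolment = [48, 50, 54, 60, None, None, None, None, 62, 60]

def fill_trend(years, values, n_fit=4):
    """Fill gaps by extending a least-squares line fitted to the first n_fit known years."""
    pts = [(y, v) for y, v in zip(years, values) if v is not None][:n_fit]
    mean_y = sum(y for y, _ in pts) / len(pts)
    mean_v = sum(v for _, v in pts) / len(pts)
    slope = (sum((y - mean_y) * (v - mean_v) for y, v in pts)
             / sum((y - mean_y) ** 2 for y, _ in pts))
    return [v if v is not None else mean_v + slope * (y - mean_y)
            for y, v in zip(years, values)]

def cap_at_capacity(values, capacity):
    """Limit every value to an assumed maximum number of places in the school."""
    return [min(v, capacity) for v in values]

ASSUMED_CAPACITY = 62  # assumption: highest enrolment actually observed near the gap

trend = fill_trend(years, enrolment)
capped = cap_at_capacity(trend, ASSUMED_CAPACITY)

print([round(v) for v in trend[4:8]])   # trend-only fill for 1954-1957
print([round(v) for v in capped[4:8]])  # same years with the capacity cap
```

On this series, the trend-only fill climbs to 75 by 1957, while the capped version holds the missing years at 62, which is closer to what limited classroom space would allow.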

There were a number of data gaps where the missing information was not as obvious as in the above example. Did the enrolment increase, decrease, or remain level? For these situations, I considered three sets of adjustments – low, medium, and high numbers – and produced estimates using each. I assumed it was unlikely that every adjustment would turn out to be the low value, or every one the high value; most likely they would average out. The low and high sets nevertheless helped me estimate a range for the final estimate.

### Double counting

Having developed a data set of total enrolment for each year and each school, I was now faced with determining the number of unique students attending school. There was no information on the average duration of attendance. (Based on my personal experience at school, I would have guessed at about 11 years, plus or minus two years. I also reviewed the Common Experience Payments under the Indian Residential Schools settlement and determined the average attendance for which compensation was paid was about four to five years. That helped for the school enrolment matters but was of no value to other actions where total data had to be converted to unique individuals.)

I built a model that would follow individuals through their time at school. Each school was modelled individually, since there was a lot of detailed data that got “lost” when amalgamated. I set an assumption for the average number of years in attendance, so I could determine when each student entered and left. But the total enrolment data for each school also gave clues: when total enrolment increased, there likely was an increase in new enrolments; when total enrolment decreased, there likely was a larger number of students leaving.

With the model, I could try various assumptions for the average length of attendance – from two to 12 years. Some of those values produced nonsensical results, such as negative enrolments or negative numbers of students leaving. A few of the values produced reasonable results. I went with them.^{1}
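The idea of backing out unique students from total enrolment, and of rejecting durations that give impossible results, can be sketched as follows. This is a deliberately simplified stand-in and not the model described above. It assumes every student attends exactly `duration` consecutive years and that the first year's enrolment is spread evenly over `duration` entry cohorts. The function names are hypothetical.

```python
def entry_cohorts(enrolment, duration):
    """Back out yearly intakes from total enrolment for a fixed attendance duration.

    Simplifying assumptions (not from the article): every student attends exactly
    `duration` consecutive years, and the first year's enrolment is spread evenly
    over `duration` entry cohorts.
    """
    cohorts = [enrolment[0] / duration] * duration   # cohorts present in the first year
    for t in range(1, len(enrolment)):
        leaving = cohorts[-duration]                 # intake from `duration` years earlier
        # Change in enrolment = new intake - leavers, so intake = change + leavers.
        cohorts.append(enrolment[t] - enrolment[t - 1] + leaving)
    return cohorts

def is_plausible(cohorts):
    """A negative intake means the assumed duration cannot reproduce the data."""
    return all(c >= 0 for c in cohorts)

# The 1950-1959 example with the missing years filled at 62.
filled = [48, 50, 54, 60, 62, 62, 62, 62, 62, 60]
cohorts = entry_cohorts(filled, duration=4)
print(sum(cohorts), is_plausible(cohorts))  # unique students, sanity flag

# Hypothetical series with a sharp drop: a 12-year duration implies a negative intake.
print(is_plausible(entry_cohorts([60, 30, 30], duration=12)))
```

Because each student belongs to exactly one entry cohort, summing the cohorts counts each person once. Looping over candidate durations and keeping only those where `is_plausible` holds mirrors the screening step described in the text.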

Building these models has been particularly challenging and fascinating. The risk of error is high. Developing something useful out of the bits of information available is very rewarding.

### Survivorship

Most of these class action matters have involved Indigenous people; a few involved non-Indigenous people. It was clear that population mortality was most appropriate and, given the span of time involved, that historic mortality was required. I considered two sources – the Human Mortality Database at McGill University and the Canada Life Tables published by Statistics Canada. The Human Mortality Database is the easiest place to obtain decades of information on Canada and many other nations’ populations and mortality. But the courts and lawyers are familiar with actuaries using the Canada Life Tables. After much searching, I was able to find mortality data from Statistics Canada dating back to 1831, so I went with that.

From a materiality perspective, going back before 1940 was not necessary, but from a general appearance and acceptance by non-actuaries, going back to at least 1910 was appropriate. From these period tables, I constructed a series of cohort mortality tables for each year of birth from 1900 to 2020. I included mortality projections using the Canadian Pensioner Mortality Scale B (that turned out to be useful, since, in a few matters, I was asked to estimate the number of expected deaths over the next five or 10 years).
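Turning period tables into a cohort table amounts to reading the period tables along a diagonal: someone born in year b is aged x in calendar year b + x. The sketch below shows that step under stated assumptions. The rates are made-up placeholders, the single flat `improvement` rate is a hypothetical stand-in for a published improvement scale, and using the most recent table at or before each year is a simplification.

```python
def cohort_q(period_q, birth_year, max_age, improvement=0.0):
    """Build cohort mortality rates by reading period tables along the diagonal.

    period_q maps calendar year -> list of q_x indexed by age.
    Years after the last table are projected with a flat annual improvement
    rate (an assumed placeholder, not an actual published scale).
    """
    table_years = sorted(period_q)
    first, last = table_years[0], table_years[-1]
    rates = []
    for age in range(max_age + 1):
        year = birth_year + age
        # Most recent table at or before this year (earliest table if none).
        base_year = max((y for y in table_years if y <= year), default=first)
        q = period_q[base_year][age]
        if year > last:
            q *= (1.0 - improvement) ** (year - last)
        rates.append(q)
    return rates

# Made-up period rates for ages 0-2, for illustration only.
toy_period_q = {
    1950: [0.050, 0.010, 0.020],
    1951: [0.040, 0.008, 0.018],
    1952: [0.030, 0.006, 0.016],
}

print(cohort_q(toy_period_q, birth_year=1950, max_age=2))
print(cohort_q(toy_period_q, birth_year=1952, max_age=2, improvement=0.01))
```

The 1950 cohort takes its age-0 rate from the 1950 table, its age-1 rate from 1951, and its age-2 rate from 1952. The 1952 cohort runs past the last table, so its later ages are projected.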

Indigenous Peoples’ mortality is much higher than that of all Canadians. There are a few studies that estimated life expectancy or mortality multiples for First Nation, Métis, and Inuit peoples, but those estimates covered only recent periods^{2} – and reached different conclusions. I am shocked at the mortality multiples. Overall, First Nations people have a life expectancy of 71 compared with 79 for all Canadians (in the 1970s, the difference in life expectancy was 10 to 11 years).^{3} Mortality multiples to be applied to the Canada Life Tables at ages 1 to 65 vary from 150% to 360% for males and from 150% to 430% for females.
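Applying a mortality multiple means scaling each age's death probability and then chaining the resulting survival probabilities. A minimal sketch, assuming age-specific multiples held in a list; the rates and multiples below are made-up placeholders, not values from any table or study.

```python
def survival_probability(q, multiples, from_age, to_age):
    """Probability that a person alive at from_age is still alive at to_age.

    q[age] is the base table's death probability; multiples[age] is the
    assumed mortality multiple (1.5 means 150% of the base rate).
    """
    p = 1.0
    for age in range(from_age, to_age):
        scaled_q = min(1.0, q[age] * multiples[age])  # a probability cannot exceed 1
        p *= 1.0 - scaled_q
    return p

# Placeholder rates and multiples for ages 0-4, for illustration only.
toy_q = [0.010, 0.002, 0.002, 0.002, 0.002]
toy_multiples = [1.5, 2.0, 2.0, 2.0, 2.0]

p_survive = survival_probability(toy_q, toy_multiples, from_age=1, to_age=4)
expected_survivors = 100 * p_survive  # expected survivors out of 100 people alive at age 1
print(p_survive, expected_survivors)
```

Multiplying a group's headcount by its survival probability to the valuation date, and summing over groups, gives the expected split between surviving and deceased members.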

The result is that I now have a set of cohort mortality tables for First Nation and Inuit peoples from 1900 to 2022. And the data to extend that back to 1831 should it ever be required!^{4}

### Conclusion

I developed a range for the estimate of class size by another application of actuarial art. Because of the data gaps and other quality issues I encountered, I was not comfortable using any statistical analysis in setting a range. I also figured the resulting range would be something like zero to a million – much too big to be of any value. I modified all the assumptions and created a sensitivity table of results. Eyeballing those led me to select two ranges – a large range that I estimated would be sufficient in almost any event and a smaller range that I expected would be more likely.

But lawyers want a single number. If I did not give them one, they were likely to pick something in the middle. So, I ran the model with the best estimate assumptions I had and used that result. It was not always in the middle of the ranges I provided. I was pretty sure the lawyers would focus on the single-point estimate. Therefore, I was careful to phrase the results so that they would know to draft a compensation plan that contemplated greater or fewer class members than the single-point estimate as well as total class membership that was outside the ranges I provided.

“What about a provision for adverse deviations (PfAD)?” I hear you scream. In most of these cases, it is not clear to me whether a PfAD would require increasing or decreasing the estimate. If the estimate is wrong, who benefits, the plaintiffs or the defendants? To avoid any unintended bias, I never asked. I tried to present a best estimate together with a range that would likely encompass the actual class size. Basically, the range is my PfAD.

The proof of the work can take a few years to play out. Only three of the cases have reached the deadline for claims to be filed. In all three, the number of claimants has been within the smaller of the two ranges provided. My fingers are crossed^{5} that this will remain the case going forward.

I would like to acknowledge the National Day for Truth and Reconciliation as well as Orange Shirt Day, coming up on September 30. The day honours the children who attended a residential school, those who never returned home and those who survived, their families and communities. The orange shirt is a symbol of the stripping away of culture, freedom, and self-esteem experienced by Indigenous children over generations. All Canadians are encouraged to wear orange to honour the survivors of residential schools. Every Child Matters.

*This article reflects the opinion of the author and does not represent an official statement of the CIA.*

[1] Just so I do not leave you wondering, for the school mandate, the model ended by suggesting to me that about four to seven years of average attendance was reasonable.

[2] For my purposes, recent means the past 30 to 50 years.

[3] Inuit life expectancy is even lower at about 55 to 65 years.

[4] However, I have no confidence in the multiples for Indigenous Peoples prior to about 1960 due to the absence of relevant data. For the sake of the work I have been doing, I have tested various multiples and determined that pre-1960 multiples are not material to the results.

[5] A potentially underused actuarial technique.

Great techniques – sometimes our job is more art than science! Well done and good luck – hopefully, your resulting ranges also work for future claims.