Injuries represent an important cause of morbidity and mortality worldwide. In retrospective epidemiological studies, estimated rates of reported injuries often decline considerably when information is included from periods more than a few months before the data collection. Such low rates are usually regarded as a consequence of memory decay. It is largely unknown whether the extent of memory decay depends on external factors otherwise affecting injury rates.
A statistical model was introduced to separate the influence of external factors on true injury rates from effects on memory decay. The relationship between apparent rates and time elapsed between injury occurrence and data collection was described by a parametric regression model. Relationships between memory decay and external factors were modelled by effect modification of the relationship with time. The procedure was applied to data collected in a retrospective household survey, carried out in Khartoum State in 2010, which elicited information about injuries that had occurred during the last year. The survey included 5661 individuals in 973 households, reporting a total of 481 non-fatal injuries.
In the data from Khartoum State, differences in memory recall were observed between socioeconomic groups, with considerably faster memory decay in the lower socioeconomic tertile. In this tertile the estimated probability that an injury which occurred 6 months ago was reported was only 18%, compared with about 35% in the remainder of the population. In the lower socioeconomic tertile, in contrast to the other groups, a simple exponential model was not sufficient to describe memory decay. Memory decay did not depend on sex, age, urban/rural status or education. Road traffic injuries were subject to less memory decay than injuries due to falls, mechanical causes and burns. Memory decay seriously affected crude overall injury rates and, to some degree, estimated relative rates.
In large parts of the world it is difficult to carry out prospective studies of injury incidence in well-defined populations. Epidemiological investigations of injuries must frequently rely on data collected retrospectively in surveys dealing with incidents in particular time intervals in the past. It is well known that memory decay affects data collected in this manner when the time intervals extend over more than a few months [2,3,4,5,6,7]. It is largely unknown, however, how memory recall depends on the actual time span between data collection and the relevant period in the past. It is also largely uncertain whether memory decay differs materially between populations or between groups defined by demographic and social factors [5,6,7].
Most retrospective studies of memory decay have compared apparent injury rates obtained by subdividing the time between data collection and injury into a number of intervals. In several cases only a few wide intervals have been compared [2, 7, 8]. This may have been necessary because of the structure of the available data, but in principle such procedures represent suboptimal use of information. An alternative approach is to describe memory recall by a specific mathematical model of the relationship with time between injury and data collection [3, 9,10,11]. Such models can be fitted to the data by general regression techniques to obtain a more detailed description of the memory decay process. Until now, however, these techniques have not incorporated an assessment of the relationships with other factors affecting injury rates.
The present study will explore these issues by modelling the magnitude of memory decay as a function of the amount of time before information is collected, considering retrospective data from a household survey of injuries carried out in Khartoum State, Sudan. The primary objective is to demonstrate how a relatively simple mathematical relationship between memory decay and time can be established by standard epidemiological procedures. The purpose is also to show how such techniques can be used to investigate whether the relationship depends on demographic and social factors or on injury cause. Finally, implications for the overall estimation of injury rates are considered. The data set has previously been used to explore associations between injury rates and potential risk factors in a Sudanese context [12], to study socioeconomic implications of injuries [13] and to examine use of health services by injured people [14].
Information was collected from each household in an interview performed by a specially trained data collector [12]. Female heads of household were identified as main respondents. In Sudan, female heads of household are considered more knowledgeable about events influencing the family, and national surveys usually rely on them as main respondents [14]. If the female head of household was not present, the next eligible adult was interviewed. If, according to the main respondent, injuries had occurred in the household during the last year, each injured individual was also interviewed about particulars of the event. If an injured individual was absent or less than 18 years old, an adult proxy was assigned. Nobody under the age of 18 years was interviewed alone [14].
A questionnaire, developed according to the World Health Organization (WHO) guidelines for surveys on injuries and violence [15], was used to elicit details about each injury reported. The general WHO definition of an injury was briefly explained to the respondents. Any injury experienced was recorded, irrespective of medical care given. Few fatal injuries were reported [13], and the present study deals with the 481 non-fatal injuries reported among the 5661 individuals included.
Additional, more complex models, in which the coefficients β1, β2, β3 were allowed to depend on any of the categorical risk factors considered, corresponded to models with effect modification (or interaction) and made it possible to test for the potential influence of such factors on memory decay. For the purpose of data exploration, an alternative to model (1) was also considered, with a categorical effect of month before interview and without postulating any particular mathematical relationship between memory decay and time. Month 1 was then regarded as the reference category. For a separate comparison of injury rates in month 12 with those in month 1 only, an analogous categorical model was used.
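The idea of effect modification of the time relationship can be illustrated by a minimal sketch. It assumes a decay term of log-quadratic form, centred so that month 1 carries no decay, with decay coefficients that differ between groups. The function names, the group labels and all numerical values are hypothetical and chosen for illustration only; they are not the estimates of this study, and the coefficient names do not correspond to β1, β2, β3 in Eq. (1).

```python
import math

def reporting_ratio(month, b_lin, b_quad=0.0):
    """Reported rate in a given month before interview, relative to month 1.

    Assumed form: exp(b_lin * s + b_quad * s**2) with s = month - 1,
    so that the ratio equals 1 in the reference month.
    """
    s = month - 1
    return math.exp(b_lin * s + b_quad * s * s)

# Hypothetical group-specific decay coefficients (effect modification):
# one group needs a quadratic term, the other follows a plain exponential.
DECAY_COEFS = {
    "group_a": (-0.40, 0.025),
    "group_b": (-0.17, 0.0),
}

def expected_reported_rate(rate_month1, month, group):
    """True month-1 rate multiplied by the group-specific decay term."""
    b_lin, b_quad = DECAY_COEFS[group]
    return rate_month1 * reporting_ratio(month, b_lin, b_quad)

if __name__ == "__main__":
    for m in (1, 3, 6, 12):
        print(m,
              round(reporting_ratio(m, *DECAY_COEFS["group_a"]), 3),
              round(reporting_ratio(m, *DECAY_COEFS["group_b"]), 3))
```

Testing for effect modification then amounts to asking whether a model with group-specific decay coefficients fits the data better than one with common coefficients.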
For illustrative purposes, unbiased crude estimates of injury rates were computed on the assumption that observations in the first month before the interview were not subject to memory decay. Similar estimates, presumably involving a certain amount of negative bias, were obtained by considering longer cumulative periods before the interview. Crude relative rates were estimated by restricting the computation to the relevant categories. As an alternative approach, the absolute injury rate in month 1 was estimated from predicted values obtained in models specified by Eq. (1). In these models no random household effect was included, as such effects tended systematically to reduce the magnitude of the predicted rates. The predicted values representing different combinations of risk factors were weighted by the corresponding number of person-years in month 1. These analyses also produced model-based estimates of relative rates.
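The crude computation over cumulative periods can be sketched as follows. The sketch assumes, for simplicity, that every individual contributes a full month of observation to each month before the interview; the function name and the monthly counts are hypothetical and are not data from the survey.

```python
def crude_rates(monthly_counts, persons):
    """Crude injury rates per person-year for cumulative recall windows.

    monthly_counts[0] is the number of injuries reported for month 1
    (the month immediately before the interview), monthly_counts[1]
    for month 2, and so on. Element k-1 of the result is the rate
    based on the most recent k months.
    """
    rates = []
    total = 0
    for k, count in enumerate(monthly_counts, start=1):
        total += count
        person_years = persons * k / 12.0  # assumes complete follow-up
        rates.append(total / person_years)
    return rates

if __name__ == "__main__":
    # Hypothetical counts that decline with time before interview.
    for k, rate in enumerate(crude_rates([100, 60, 40], persons=1200), start=1):
        print(f"last {k} month(s): {rate:.3f} injuries per person-year")
```

With declining monthly counts, the rate based on month 1 alone is the largest, and each longer window gives a lower value, which is the negative bias referred to above.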
Considering the relationship with time since injury within tertiles of socioeconomic status, the category with the lower status exhibited a rather different behaviour from the other two (Fig. 2 and Table 4). Memory decay initially occurred at a much faster rate in this group, with substantially lower probabilities that an injury would be reported during the first 6 months. A log-quadratic function was clearly needed within this tertile to provide a reasonable description of the relationship with time since injury. Taking into account the standard errors (SE) associated with the estimates of the quadratic coefficients in the middle and upper socioeconomic tertiles (Table 4), there was essentially no justification for a quadratic term in these groups. The injury rate in month 1 was also much higher in the lower socioeconomic tertile, twice the value found in the middle tertile (Table 4). Despite these major differences between socioeconomic groups, the estimated rates of reported injuries did not differ substantially after about 7 months (Fig. 2).
This paper has shown how a simple parametric statistical model may be used for assessing the effect of relevant factors on memory loss, depending on the time since an injury occurred. In the data from Khartoum State a basic exponential model, corresponding to a constant rate of memory loss [22], did not suffice for describing the overall relationship. Both in the complete data set and in the lower socioeconomic tertile, a model involving an exponential function with a quadratic expression in time was needed, although basic exponential models were still adequate in the other tertiles. The general model was also used for exploring memory decay after injuries due to specific causes, suggesting that relationships with time may differ, with slower memory decay after road traffic injuries.
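The correspondence between a constant rate of memory loss and the basic exponential model can be made explicit with a short derivation. The symbols p(t), λ and γ are introduced here for illustration only and are not the notation of Eq. (1); p(t) denotes the probability that an injury which occurred a time t before the interview is reported.

```latex
% Constant rate of memory loss gives the basic exponential model:
\frac{dp}{dt} = -\lambda\, p(t), \qquad p(0) = 1
\quad\Longrightarrow\quad p(t) = e^{-\lambda t}.

% An exponential function with a quadratic expression in time
% gives a rate of loss that changes with t:
p(t) = \exp\!\left(-\lambda t + \gamma t^{2}\right)
\quad\Longrightarrow\quad
-\frac{d}{dt}\log p(t) = \lambda - 2\gamma t .
```

For γ > 0 the rate of loss declines with time, so that decay is fastest shortly after the injury. As a purely arithmetical illustration under the basic exponential form, a reporting probability of about 35% at 6 months would correspond to λ = ln(1/0.35)/6 ≈ 0.175 per month.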
In some studies of memory decay after injuries, another external data source has been available, providing information about all or nearly all injuries that could potentially be reported [2, 8, 23,24,25]. The statistical analysis can then proceed in a different manner, with direct estimation of reporting probabilities. In studies based on comparison of rates at different times, various mathematical expressions have been introduced to model the relationship between memory decay and time since injury in particular groups [3, 9,10,11, 26]. These models correspond to the last part of our formulation (1), representing the contribution to the rate of reported injuries made by the time variable. The fundamental assumption that injuries reported from the period immediately before data collection reflect the true state was formulated explicitly by Massey and Gonzalez as early as 1976 [3], and this assumption underlies the arguments used in various subsequent papers on memory recall [4, 9]. In some studies the mathematical relationship has been fitted separately to subsets of the data representing particular injury categories [3, 9, 11, 27]. Yet, to our knowledge, no one has previously considered a joint statistical model for the rate of observed injuries, combining one term representing the true injury rate in the first time period, depending on overall risk factors, with a second term describing memory decay further back in time.