In an earthquake catalog, the magnitude of completeness (Mc) is the minimum magnitude above which all earthquakes within a certain region are reliably recorded.[1] For example, if the Mc of a catalog for a specific region is 2.6 from 1980 to the present, then all earthquakes above magnitude 2.6 in that region have been recorded in the catalog from 1980 to the present. When interpreting such data, an Mc that is too high leads to under-sampling, whereas an Mc that is too low can yield erroneous seismicity parameters.[2]
Bengoubou-Valérius and Gibert (2013) carried out a comparative analysis of these methods and concluded that the maximum likelihood method is the most robust. With the correction for magnitude binning (Utsu, 1966), the maximum likelihood estimate of the b-value has the following form:
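The commonly cited form of this estimator (the Aki maximum likelihood estimate with Utsu's binning correction) is \(b = \log_{10} e \,/\, \left(\langle M \rangle - (M_c - \Delta M / 2)\right)\), where \(\langle M \rangle\) is the mean magnitude of the events with \(M \geqslant M_c\) and \(\Delta M\) is the magnitude bin width. The Python sketch below evaluates it; the function name `b_value_ml`, the symbol \(\Delta M\) (argument `dm`), and its default of 0.1 are assumptions made for illustration and are not taken from the text.

```python
import math

def b_value_ml(magnitudes, mc, dm=0.1):
    """Maximum likelihood b-value with the binning correction dm/2.

    magnitudes : iterable of magnitudes (assumed binned with width dm)
    mc         : magnitude of completeness
    dm         : bin width (hypothetical default of 0.1)
    """
    # Keep only the complete part of the sample; the small tolerance
    # guards against floating-point noise in binned magnitudes.
    sample = [m for m in magnitudes if m >= mc - 1e-9]
    if len(sample) < 2:
        raise ValueError("too few events above mc")
    mean_m = sum(sample) / len(sample)
    denom = mean_m - (mc - dm / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed mc - dm/2")
    return math.log10(math.e) / denom
```

Because the mean magnitude enters the denominator, including incompletely recorded small events (an \(M_c\) set too low) shifts the estimate, which is why the choice of \(M_c\) matters for the b-value.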
Underestimated \(M_c\) values lead to incorrect estimates of the seismicity parameters and to a misleading interpretation of the data. For example, including incompletely detected events in the b-value estimation can cause the b-value to be substantially underestimated (Fig. 2b), which, in turn, entails underestimation of the activity rates of small earthquakes and overestimation of the activity rates of large earthquakes. Overestimating \(M_c\) may be less critical (Chen et al., 2011), but it reduces the size of the analyzed sample and thus increases the uncertainty of all subsequent results. The \(M_c\) value varies in space and time, and these variations should be taken into account when earthquake catalogs are used to estimate seismicity parameters.
Later, Rydelek and Sacks (1989) proposed a method for estimating \(M_c\) based on the comparison of the number of earthquakes recorded during the day and at night. This method is intended for analyzing local catalogs. It is assumed that the probability of detecting an earthquake increases at night because of the decrease in the level of industrial noise. Thus, if the predominant number of earthquakes in a given magnitude interval is recorded at night, the catalog is considered incomplete in this interval.
The results of these two approaches were compared, in particular, by Smirnov and Gabsatarova (2000), who used the earthquake catalog of the North Caucasus to show that the estimates of the minimum completely reported energy class obtained by the two approaches are fairly consistent with each other and adequately reflect the changes in the seismic network.
The methods of the first group are less cumbersome, require less information, and are more frequently used in practice. In addition, the information required by the second group of methods is not always available to researchers, so these methods are not always applicable. Therefore, in this work, we consider only the first group of methods. Given the diversity of existing methods, choosing the preferable method for estimating \(M_c\) is not trivial.
In this work, we consider six modern methods of \(M_c\) estimation and analyze how the estimates obtained by these methods on synthetic earthquake catalogs depend on the sample size and on the shape of the initial magnitude distribution used to synthesize the catalog. Variations of the \(M_c\) estimates are analyzed using the bootstrap method.
\(M_c\) estimate at the point of maximum curvature of the cumulative earthquake FMD (Wiemer and Wyss, 2000). This is the simplest and fastest procedure of \(M_c\) estimation. In practice, the point of maximum curvature of the cumulative FMD corresponds to the magnitude bin that contains the maximum number of events in the sample, i.e., the maximum of the incremental FMD. Although simple and reliable, this procedure tends to underestimate \(M_c\), especially when the incremental FMD does not have a clearly pronounced maximum. Presumably, this shape of the incremental FMD, namely, the curvature of its incomplete part, can indicate that the analyzed sample is heterogeneous (Mignan, 2012). Nevertheless, for homogeneous samples this estimate is adequate (Mignan et al., 2011). We denote this procedure by MAXC (MAXimum Curvature) and the corresponding estimate of the magnitude of completeness by \(M_c^{\text{MAXC}}\). An example of this estimate is shown in Fig. 1.
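Since the MAXC estimate reduces in practice to the most populated magnitude bin, it can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the work: the function name `mc_maxc`, the bin width `dm = 0.1`, and the rule of taking the lowest magnitude when several bins tie are all assumptions, and the magnitudes are assumed to be already reported to the bin precision.

```python
from collections import Counter

def mc_maxc(magnitudes, dm=0.1):
    """MAXC-style estimate: magnitude of the most populated bin.

    magnitudes : iterable of magnitudes, assumed already binned with width dm
    dm         : bin width (hypothetical default of 0.1)
    """
    # Map each magnitude to an integer bin index to avoid comparing floats.
    counts = Counter(round(m / dm) for m in magnitudes)
    if not counts:
        raise ValueError("empty sample")
    top = max(counts.values())
    # Assumption: if several bins share the maximum count, take the lowest.
    index = min(k for k, v in counts.items() if v == top)
    return round(index * dm, 10)
```

When the incremental FMD has a flat maximum, the winning bin can change from one resample to another, which is one way the instability described above shows up.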
Subsequently, the method was modified so that the model included \(M_c\) in the explicit form (Woessner and Wiemer, 2005). These authors proposed a combined model in which the normal distribution describes the probability of detecting the events pertaining to the catalog part with incompletely reported magnitudes:
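The formula itself is not reproduced in this text. As a hedged illustration only, a detection probability of the kind described (a cumulative normal distribution below \(M_c\) and full detection at and above \(M_c\)) can be sketched in Python as follows. The parameter names `mu` (magnitude at which half of the events are detected) and `sigma` (width of the transition), as well as the function name, are assumptions introduced here.

```python
import math

def detection_probability(m, mc, mu, sigma):
    """Probability that an event of magnitude m is recorded.

    Assumed form: cumulative normal with mean mu and standard
    deviation sigma for m < mc, and exactly 1 for m >= mc.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if m >= mc:
        return 1.0
    # Normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf((m - mu) / (sigma * math.sqrt(2.0))))
```

With these assumed parameters, events at `m = mu` are detected half of the time, and the curve rises toward 1 as `m` approaches `mc`.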
Bootstrapping (Efron and Tibshirani, 1993) is a nonparametric method for studying the distributions of the sought parameters, which is based on repeated Monte Carlo resampling of a given sample. The method allows fast and simple estimation of various statistics (e.g., variance, confidence intervals) for complex models. The idea of the bootstrapping technique is that a set of \(n_b\) samples of a given size is drawn from the original sample by random selection with replacement. The sought parameters are estimated on each of these resamples and, based on the resulting empirical distributions, all the required statistics are determined.
All the described procedures were implemented as MATLAB programs by V.A. Pavlenko. To test whether the procedures work correctly, an attempt was made to reproduce the results obtained in (Woessner and Wiemer, 2005; Amorèse, 2007). For this purpose, we used the same samples from earthquake catalogs as those used by the authors of the cited papers to illustrate their analyses.
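The resampling scheme just described can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the code used in the work: the function names, the default of 1000 resamples, the fixed random seed, and the simple percentile interval are all choices made here for the example.

```python
import random
import statistics

def bootstrap_estimates(sample, estimator, n_b=1000, seed=0):
    """Apply `estimator` to n_b resamples drawn with replacement.

    Each resample has the same size as the original sample.
    """
    rng = random.Random(seed)
    n = len(sample)
    return [estimator(rng.choices(sample, k=n)) for _ in range(n_b)]

def summarize(estimates, level=0.95):
    """Mean, standard deviation, and a percentile interval of the estimates."""
    ordered = sorted(estimates)
    last = len(ordered) - 1
    lower = ordered[int((1.0 - level) / 2.0 * last)]
    upper = ordered[int((1.0 + level) / 2.0 * last)]
    return {
        "mean": statistics.mean(ordered),
        "std": statistics.stdev(ordered),
        "ci": (lower, upper),
    }
```

Any estimator that maps a list of magnitudes to a number, such as an \(M_c\) estimator, can be passed as `estimator`; the spread of the returned values then characterizes the variability of that estimate.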
In these results, some characteristic features of the methods under study show up: the MAXC and GFT methods typically yield the lowest \(M_c\) estimates; the MBS estimates are usually the most conservative. The estimates obtained by the other methods are intermediate between the conservative and optimistic estimates.
From Table 1 it can be seen that for all methods but MBS, the variances of the \(M_c\) estimates calculated from the samples of the NCSN and CMT catalogs proved to be minimal or close to minimal. The distribution of the sample from the NCSN catalog is close to the characteristic angular shape with a pronounced maximum, which is typical of samples with a uniform detection rate. The distribution of the sample from the CMT catalog has a flat maximum; the scatter of the \(M_c\) estimates for this sample is noticeably higher than for the sample from the NCSN catalog.
The distributions of the samples from the ECOS and NIED catalogs have a smoothed shape with a noticeably curved incomplete part of the incremental FMD, which may indicate that the sample data are heterogeneous. Overall, this leads to higher variances of the \(M_c\) estimates and to more significant discrepancies between the \(M_c\) estimates obtained by the different methods.
Our results demonstrate the sensitivity of the methods of \(M_c\) estimation to the shape of the distribution of the analyzed sample. This effect is considered in more detail on the synthetic earthquake catalogs in the next sections.
To create a synthetic earthquake catalog, one should use a magnitude distribution model that can describe both the complete and incomplete parts of the catalog. Besides the OK and WW models noted above, there are two more distribution models suitable for this purpose.
This shape of the distribution is characteristic of the early period of the instrumental observations when weak events were missed because of the relatively low sensitivity of the instruments. Both AN and POL models obey Eq. (9).
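As a rough illustration of how a synthetic catalog with complete and incomplete parts can be generated, the Python sketch below draws magnitudes from an exponential (Gutenberg-Richter) distribution and then thins them with a cumulative normal detection probability below \(M_c\). This is an assumption-laden sketch in the spirit of a normal-detection model; it does not reproduce the AN or POL models or Eq. (9), whose formulas are not given in this text. All parameter names and default values (`b`, `m_min`, `mc`, `mu`, `sigma`, `dm`) are hypothetical.

```python
import math
import random

def synthetic_catalog(n_events, b=1.0, m_min=0.0, mc=2.0,
                      mu=1.5, sigma=0.3, dm=0.1, seed=0):
    """Generate n_events recorded magnitudes, binned with width dm."""
    rng = random.Random(seed)
    beta = b * math.log(10.0)  # exponential rate equivalent to the b-value
    catalog = []
    while len(catalog) < n_events:
        # Exponentially distributed magnitude above m_min, then binned.
        m = m_min + rng.expovariate(beta)
        m = round(round(m / dm) * dm, 10)
        if m >= mc - 1e-9:
            p = 1.0  # complete part: every event is recorded
        else:
            # Incomplete part: assumed cumulative normal detection probability.
            p = 0.5 * (1.0 + math.erf((m - mu) / (sigma * math.sqrt(2.0))))
        if rng.random() < p:
            catalog.append(m)
    return catalog
```

Catalogs generated this way with different `n_events` can be fed to an \(M_c\) estimator to examine how the estimate behaves as the sample size grows, since the true `mc` is known by construction.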
As is known, an estimate is called consistent if it converges in probability to the parameter being estimated. A consistent estimate is asymptotically unbiased, and its variance decreases as the sample size increases. Thus, it was expected that, as the sample size increases, the mean values of the \(M_c\) estimates will approach the true \(M_c\) and their variances will decrease.
The \(M_c^{\text{MBASS}}\) estimates, on average, correspond to the \(M_c^{\text{MAXC}}\) estimates but have a higher variance. For all three AN catalogs, the EMR method overestimates \(M_c\) on small samples; however, as \(N\) increases, the \(M_c^{\text{EMR}}\) estimate gradually converges to the true \(M_c\) value. The results depend only weakly on the parameter \(k\).
The obtained results show that all the described methods are quite efficient in determining \(M_c\) for the AN model, which has an abrupt junction between the incomplete and complete parts of the sample. The results for the WW model with a smoothed distribution maximum are somewhat worse. Finally, in the case of the POL model, in which this transition is barely expressed, all the described methods fail to determine \(M_c\) correctly.
On the one hand, this result suggests that the completeness of the data from the beginning of instrumental catalogs should be analyzed with particular scrutiny. On the other hand, the observations of García-Hernández et al. (2019), which indicate that the POL model occurs quite rarely in real catalogs, give hope that this distribution will seldom be encountered in practice.
The MAXC method proves to be indispensable in analyzing the completeness of samples from small spatiotemporal volumes, which characteristically have the shape of the AN distribution. It is the only method that can be used to analyze the completeness of extremely small samples with \(N \geqslant 4\) (Mignan et al., 2011). This method was used for high-resolution mapping of the spatial variations in \(M_c\) in Taiwan (Mignan et al., 2011). For larger samples containing hundreds of events or more, where the maximum of the magnitude distribution is typically flat, the MAXC method underestimates \(M_c\) and is not recommended.