In physics or engineering education, a Fermi problem (also known as a Fermi quiz, Fermi question, Fermi estimate, order-of-magnitude problem, order-of-magnitude estimate, or order estimation) is an estimation problem designed to teach dimensional analysis or approximation of extreme scientific calculations. Fermi problems are usually back-of-the-envelope calculations. The estimation technique is named after physicist Enrico Fermi, who was known for his ability to make good approximate calculations with little or no actual data. Fermi problems typically involve making justified guesses about quantities and their variance or lower and upper bounds. In some cases, order-of-magnitude estimates can also be derived using dimensional analysis.
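As a simple illustration of the dimensional-analysis route: for a pendulum of length $L$ swinging under gravity $g$, the only combination of the available quantities with units of time is $\sqrt{L/g}$, so the period must be $T \sim \sqrt{L/g}$ up to a dimensionless constant (the exact small-angle result is $T = 2\pi\sqrt{L/g}$).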
An example is Enrico Fermi's estimate of the yield of the atomic bomb detonated at the Trinity test, based on the distance traveled by pieces of paper he dropped from his hand during the blast. Fermi's estimate of 10 kilotons of TNT was well within an order of magnitude of the now-accepted value of 21 kilotons.[1][2][3]
Possibly the most famous Fermi question is the Drake equation, which seeks to estimate the number of intelligent civilizations in the galaxy. The related question of why, if a significant number of such civilizations exist, human civilization has never encountered any others is called the Fermi paradox.[6]
Scientists often look for Fermi estimates of the answer to a problem before turning to more sophisticated methods to calculate a precise answer. This provides a useful check on the results. While the estimate is almost certainly imprecise, it is also a simple calculation that allows easy error checking and exposes faulty assumptions if the figure produced is far beyond what might reasonably be expected. By contrast, precise calculations can be extremely complex, and the far larger number of factors and operations involved can obscure a significant error, either in the mathematical process or in the assumptions the equation is based on; yet the result may still be assumed to be right because it was derived from a precise formula that is expected to yield good results. Without a reasonable frame of reference to work from, it is seldom clear whether a result is acceptably precise or is several orders of magnitude (tens or hundreds of times) too big or too small. The Fermi estimate gives a quick, simple way to obtain this frame of reference for what might reasonably be expected as the answer.
Fermi estimates are also useful in approaching problems where the optimal choice of calculation method depends on the expected size of the answer. For instance, a Fermi estimate might indicate whether the internal stresses of a structure are low enough that it can be accurately described by linear elasticity, or whether the estimate already relates meaningfully in scale to some other value, for example, whether a structure will be over-engineered to withstand loads several times greater than the estimate.[citation needed]
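As a hedged sketch of the first use, the snippet below compares a one-line stress estimate for a loaded column against a yield strength to decide whether a linear-elastic model is plausible. All of the input numbers are illustrative assumptions, not values taken from the text:

```python
# Sketch: use a Fermi estimate of stress to pick a modelling approach.
# All inputs are assumed, round-number values chosen for illustration.

load_newtons = 1e5         # assumed applied load (~10 tonnes-force)
cross_section_m2 = 0.01    # assumed cross-section (10 cm x 10 cm column)
yield_stress_pa = 250e6    # assumed yield strength (~mild steel, 250 MPa)

stress_pa = load_newtons / cross_section_m2  # mean axial stress, F / A
fraction_of_yield = stress_pa / yield_stress_pa

print(f"stress ~ {stress_pa:.0e} Pa ({fraction_of_yield:.0%} of yield)")
# ~1e7 Pa, about 4% of yield: comfortably in the linear-elastic regime,
# so linear elasticity is a reasonable first model for this structure.
```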
Although Fermi calculations are often not accurate, as there may be many problems with their assumptions, this sort of analysis does tell one what to look for to get a better answer. In the classic example of estimating the number of piano tuners in Chicago, one might try to find a better estimate of the number of pianos tuned by a tuner in a typical day, or look up an accurate figure for the population of Chicago. The calculation also gives a rough estimate that may be good enough for some purposes: if a person wants to open a store in Chicago that sells piano-tuning equipment, and calculates that 10,000 potential customers are needed to stay in business, they can reasonably assume that the estimate falls far enough below 10,000 that a different business plan should be considered (and, with a little more work, a rough upper bound on the number of piano tuners could be computed by taking the most extreme reasonable value for each assumption).
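A minimal sketch of that classic calculation, with every input an assumed round number rather than measured data, might look like this:

```python
# A minimal sketch of the classic Fermi estimate of piano tuners in Chicago.
# Every number below is an assumed, order-of-magnitude input, not measured data.

chicago_population = 9_000_000    # people in the metro area (assumed)
people_per_household = 2          # assumed average household size
households_with_piano = 1 / 20    # assumed fraction of households with a piano
tunings_per_piano_per_year = 1    # assumed: pianos tuned about once a year
tunings_per_tuner_per_day = 4     # assumed: ~2 hours per tuning plus travel
working_days_per_year = 250       # assumed: 5 days/week, 50 weeks/year

pianos = chicago_population / people_per_household * households_with_piano
tunings_needed_per_year = pianos * tunings_per_piano_per_year
tunings_per_tuner_per_year = tunings_per_tuner_per_day * working_days_per_year

tuners = tunings_needed_per_year / tunings_per_tuner_per_year
print(f"Estimated piano tuners in Chicago: ~{tuners:.0f}")  # ~225, order 10^2
```

Each input is uncertain by a factor of two or more, but the product lands at the right order of magnitude, which is all the method promises.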
Fermi estimates generally work because the estimates of the individual terms are often close to correct, and overestimates and underestimates tend to cancel each other out. That is, if there is no consistent bias, a Fermi calculation that involves the multiplication of several estimated factors (such as the number of piano tuners in Chicago) will probably be more accurate than might be first supposed.
In detail, multiplying estimates corresponds to adding their logarithms; thus one obtains a sort of Wiener process or random walk on the logarithmic scale, which diffuses as $\sqrt{n}$ (in the number of terms $n$). In discrete terms, the number of overestimates minus underestimates will have a binomial distribution. In continuous terms, if one makes a Fermi estimate of $n$ steps, each with standard deviation $\sigma$ units on the log scale from the actual value, then the overall estimate will have standard deviation $\sigma\sqrt{n}$, since the standard deviation of a sum scales as $\sqrt{n}$ in the number of summands.
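This scaling is easy to check by simulation. The sketch below assumes each of the $n$ factor estimates is off by an independent normal error of standard deviation $\sigma$ on the log scale, and confirms that the log error of the product has standard deviation close to $\sigma\sqrt{n}$:

```python
# Sketch: Monte Carlo check of the sigma * sqrt(n) error-growth claim.
# Assumption: each factor's log-scale error is an independent N(0, sigma).

import math
import random
import statistics

def fermi_log_error(n: int, sigma: float) -> float:
    """Total log-scale error of a product of n estimated factors."""
    return sum(random.gauss(0.0, sigma) for _ in range(n))

n, sigma, trials = 9, 0.5, 100_000
errors = [fermi_log_error(n, sigma) for _ in range(trials)]

print(f"empirical sd of log error: {statistics.stdev(errors):.3f}")
print(f"predicted sigma*sqrt(n):   {sigma * math.sqrt(n):.3f}")  # 1.500
```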
There are or have been a number of university-level courses devoted to estimation and the solution of Fermi problems. The materials for these courses are a good source of additional Fermi problem examples and material about solution strategies.