I hope that the debate on 14 Mar 2025 will address the elephant in the room: sadly, the Nobel Prize in Physics 2024 for Hopfield & Hinton is a Nobel Prize for plagiarism. They republished methodologies developed in Ukraine and Japan by Ivakhnenko and Amari in the 1960s & 1970s, as well as other techniques, without citing the original papers. Even in later surveys, they didn't credit the original inventors (thus turning what may have been unintentional plagiarism into a deliberate form). None of the important algorithms for modern Artificial Intelligence were created by Hopfield & Hinton. See details in the recent technical reports on this [NOB][DLP][DLH], with lots of links, facts, and references (including those cited below).
Some people have lost their titles or jobs due to plagiarism, e.g., Harvard's former president [PLAG7]. How can advisors now continue to tell their students that they should avoid plagiarism at all costs?
Of course, it is well known that plagiarism can be either "unintentional" or "intentional or reckless" [PLAG1-6], and the more innocent of the two may well apply here, at least in part. But science has a well-established way of dealing with "multiple discovery" and plagiarism, be it unintentional [PLAG1-6][CONN21] or not [FAKE2], based on facts such as time stamps of publications and patents. The deontology of science requires that unintentional plagiarists correct their publications through errata and then credit the original sources properly in the future. The awardees didn't; instead they kept collecting citations for inventions of other researchers [NOB][DLP].
Details (see references in [NOB][DLP][DLH]):
1. The Lenz-Ising recurrent architecture with neuron-like elements was published in 1925 [L20][I24][I25]. In 1972, Shun-Ichi Amari made it adaptive such that it could learn to associate input patterns with output patterns by changing its connection weights [AMH1]. However, Amari is only briefly cited in the "Scientific Background to the Nobel Prize in Physics 2024" [Nob24a]. Unfortunately, Amari's net was later called the "Hopfield network." Hopfield republished it 10 years later [AMH2] without citing Amari, and never cited him even in later papers.
The Nobel Committee for Physics [Nob24a] briefly cites the important work of Nakano (1972)[NAK72][DLH], a former collaborator of Amari. Nakano also had a recurrent associative memory, but it had ternary activation values and wasn't the "Hopfield network" - sometimes called the "Amari-Hopfield Network" [AMH3] - first published by Amari (1972)[AMH1][DLH].
Remarkably, Hopfield [AMH2] was aware of Amari: he cites Amari's later papers on the separate topic of self-organisation in NNs (1977, 1978), but ignores his crucial 1972 paper [AMH1][DLH] (as well as Nakano's paper). See also Little's work (1974-1980) [AMH1b-d] on connecting the Lenz-Ising network (1920s)[L20][I24][I25] to learning neural networks (NNs).
The Amari network (1972) [AMH1] stores a finite number of patterns. Its connection weights are correlations of these patterns. A stored pattern is recalled from similar patterns through the neural dynamics. So the Amari net is essentially like the much later Hopfield net. The Nobel Prize is about “foundational discoveries that enable machine learning with artificial NNs.” Amari obviously published this "foundational discovery" 10 years before Hopfield, when compute was about 100 times more expensive. However, even Hopfield’s much later survey of "Hopfield networks" (Scholarpedia, 2007)[AMH4] failed to cite Amari (1972).
Note that making slight modifications doesn't let you ignore the work that introduced the key innovation. These are just the elementary rules of scientific publishing. Hopfield's own contributions build on the prior work: an analysis of storage capacity and a suitable Lyapunov function for sequential neuron updates instead of Amari's parallel updates (in practice, this hardly makes a difference [AM24]) to show that the "Hopfield network" settles into an equilibrium in response to static input patterns [AMH2]. These sequential updates and equilibrium nets are mostly irrelevant in modern AI, which uses massively parallel neuron updates and focuses heavily on sequence processing [DLH]. Note that Amari (1972) already had a sequence-processing generalization of the "Hopfield network" [AMH1]!
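The storage-and-recall scheme described above can be sketched in a few lines of Python. This is a minimal illustration of the standard formulation (correlation weights over bipolar patterns, recall by thresholded dynamics), not code from either original paper; the pattern size and noise level are arbitrary choices. It shows both Amari-style parallel updates and Hopfield-style sequential updates, which here converge to the same stored pattern:

```python
import numpy as np

def store(patterns):
    """Connection weights as correlations (outer products) of the stored
    bipolar patterns, with zero self-connections."""
    P = np.array(patterns, dtype=float)   # shape (num_patterns, n)
    n = P.shape[1]
    W = P.T @ P / n
    np.fill_diagonal(W, 0.0)
    return W

def recall_parallel(W, s, steps=10):
    """Amari-style recall: all neurons updated at once."""
    s = np.array(s, dtype=float)
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0
    return s

def recall_sequential(W, s, sweeps=10):
    """Hopfield-style recall: one neuron updated at a time."""
    s = np.array(s, dtype=float)
    for _ in range(sweeps):
        for i in range(len(s)):
            s[i] = 1.0 if W[i] @ s >= 0 else -1.0
    return s

# Store one 8-unit pattern and recall it from a corrupted version.
p = [1, -1, 1, 1, -1, -1, 1, -1]
W = store([p])
noisy = list(p)
noisy[0] = -noisy[0]   # flip one bit
print(recall_parallel(W, noisy))
```

With a single stored pattern and one flipped bit, both update schemes restore the stored pattern, illustrating why the practical difference between parallel and sequential updates is small [AM24].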
In 1984, Hopfield published an analog version [HOP84][Nob24a] that failed to cite Cohen and Grossberg's earlier work (1983)[GRO83], which described a Lyapunov function for the "Additive Model," later called the "Hopfield model." This publication built on Grossberg's even earlier work (1978) on the Additive Model [GRO78], which introduced a Lyapunov functional to help prove convergence. See also Grossberg's overviews [GRO20][GRO21]. Again, even Hopfield’s much later survey of "Hopfield networks" (Scholarpedia, 2007)[AMH4] did not mention this.
2. The related Boltzmann Machine paper by Ackley, Hinton, and Sejnowski (1985)[BM] was about learning internal representations in hidden units of neural networks (NNs)[S20]. It didn't cite the first working algorithm for deep learning of internal representations by Ivakhnenko & Lapa (Ukraine, 1965)[DEEP1-2][HIN]. It didn't cite Amari's separate work (1967-68)[GD1-2] on learning internal representations in deep NNs end-to-end through stochastic gradient descent (SGD). Not even the later surveys by the authors [S20][DL3][DLP] nor the "Scientific Background to the Nobel Prize in Physics 2024" mention these origins of deep learning.
The Boltzmann machine-like Sherrington-Kirkpatrick model (1975)[SK75] is based on the general Edwards-Anderson model (1975)[EA75][EA21] (with random connections driving the network dynamics) and "learns" optimal weights J_ij for minimising Free Energy, where each point in the phase-diagram corresponds to an “internal representation.” The 1985 Boltzmann Machine paper [BM] fails to cite both papers.
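For intuition, here is a toy Python sketch of the energy function underlying these spin models and the Boltzmann Machine, and of the resulting Boltzmann distribution over states. The hand-picked ferromagnetic couplings are an illustrative assumption (the Edwards-Anderson/SK models use random couplings), and no learning rule is included:

```python
import itertools
import math

def energy(s, J):
    """Ising-style energy: E(s) = -sum over pairs i<j of J[i][j]*s[i]*s[j]."""
    n = len(s)
    return -sum(J[i][j] * s[i] * s[j]
                for i in range(n) for j in range(i + 1, n))

def boltzmann_probs(J, n, T=1.0):
    """Exact Boltzmann distribution p(s) ~ exp(-E(s)/T) over all 2^n states."""
    states = list(itertools.product([-1, 1], repeat=n))
    weights = [math.exp(-energy(s, J) / T) for s in states]
    Z = sum(weights)                      # partition function
    return {s: w / Z for s, w in zip(states, weights)}

# Tiny 3-spin example with ferromagnetic couplings:
# the two fully aligned states have the lowest energy and highest probability.
J = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
probs = boltzmann_probs(J, 3)
best = max(probs, key=probs.get)
print(best, probs[best])
```

A Boltzmann Machine learns the couplings J to shape this distribution; the SK model instead draws them at random and studies the resulting phases.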
Hinton's Boltzmann Machine co-author Sejnowski was a student of Hopfield. He is also known for sending "amicus curiae" ("friend of the court") letters to award committees. He claims [S20]: "In 1969, Minsky & Papert [M69] showed that shallow NNs without hidden layers are very limited and the field was abandoned until a new generation of neural network researchers took a fresh look at the problem in the 1980s." This claim is echoed in the "Popular information" of the Nobel Committee. In an interview, Sejnowski also claimed: "Our goal was to try to take a network with multiple layers—an input layer, an output layer and layers in between—and make it learn. It was generally thought, because of early work that was done in AI in the 60s, that no one would ever find such a learning algorithm because it was just too mathematically difficult."
However, the 1969 book [M69] addressed a "problem" of Gauss & Legendre's shallow 1-layer NNs ("method of least squares," circa 1800) [DL1-2][DLH] that had already been solved 4 years prior by Ivakhnenko & Lapa's popular deep learning method (1965) [DEEP1-2][DL2], and then also by Amari's SGD for MLPs (1967)[GD1-2][DLH]. Minsky (who is cited by the "Scientific Background to the Nobel Prize in Physics 2024") was apparently unaware of this and failed to correct it later [DLH][HIN][DLP].
3. The Nobel Committee also lauds Hinton et al.'s method for layer-wise pretraining of deep NNs (2006)[UN4]. However, this work neither cited the original layer-wise training of deep NNs by Ivakhnenko & Lapa (1965)[DEEP1-2] nor the original work on unsupervised pretraining of deep NNs (1991)[UN0-1][DLP].
Ivakhnenko's 1971 paper [DEEP2] already described a deep learning net with 8 layers [DL2], comparable in depth to Hinton's 2006 nets [UN4], which were published without comparison to this original work [DEEP1-2][DL2] - work done when compute was millions of times more expensive. Given a training set of input vectors with corresponding target output vectors, layers are incrementally grown and trained by regression analysis. In a fine-tuning phase, superfluous hidden units are pruned with the help of a separate validation set [DEEP2][DLH].
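The layer-growing procedure can be sketched roughly as follows. This is a loose Python illustration in the spirit of Ivakhnenko's method, not his original algorithm: the quadratic unit form, the number of units kept per layer, and the stopping rule are illustrative assumptions. Each layer's candidate units are fit by least-squares regression, pruned with a validation set, and growth stops when validation error no longer improves:

```python
import itertools
import numpy as np

def fit_unit(x1, x2, y):
    """Least-squares fit of a quadratic unit of two inputs:
    a0 + a1*x1 + a2*x2 + a3*x1*x2 + a4*x1^2 + a5*x2^2."""
    A = np.stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2], axis=1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def unit_out(coef, x1, x2):
    A = np.stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2], axis=1)
    return A @ coef

def grow(X_tr, y_tr, X_va, y_va, max_layers=8, keep=4):
    """Incrementally grow layers of pairwise regression units,
    pruning superfluous units with a separate validation set."""
    best_err = np.inf
    for _ in range(max_layers):
        cands = []
        for i, j in itertools.combinations(range(X_tr.shape[1]), 2):
            c = fit_unit(X_tr[:, i], X_tr[:, j], y_tr)
            err = np.mean((unit_out(c, X_va[:, i], X_va[:, j]) - y_va) ** 2)
            cands.append((err, i, j, c))
        cands.sort(key=lambda t: t[0])
        if cands[0][0] >= best_err:      # stop: validation error no longer improves
            break
        best_err = cands[0][0]
        kept = cands[:keep]              # prune all but the best units
        X_tr = np.stack([unit_out(c, X_tr[:, i], X_tr[:, j])
                         for _, i, j, c in kept], axis=1)
        X_va = np.stack([unit_out(c, X_va[:, i], X_va[:, j])
                         for _, i, j, c in kept], axis=1)
    return best_err

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] * X[:, 1] + 0.5 * X[:, 2]    # simple nonlinear target
err = grow(X[:100], y[:100], X[100:], y[100:])
print(err)
```

The point of the sketch is structural: depth emerges by stacking regression-trained layers, with no gradient descent involved.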
Hinton and Sejnowski have never cited the origins of deep learning in Ukraine and Japan in the 1960s and 1970s. None of the important algorithms for modern AI were invented by them.
4. The Physics Nobel Committee [Nob24a] tries very hard to give the impression that modern pattern recognition and deep learning are somehow based on physics-inspired NNs, but they aren't. The well-known backpropagation technique (Linnainmaa, 1970)[BP1-5][BPA-C][DLP] is an efficient way of applying the chain rule (Leibniz, 1676)[LEI07-10][DLH] to big networks with differentiable nodes [BP4]. It is much more important for modern AI than physics-inspired equilibrium nets such as the so-called "Hopfield network" and the "Boltzmann Machine," which are irrelevant for the modern AI applications mentioned by the Nobel Foundation [Nob24a].
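To make concrete what "applying the chain rule to networks with differentiable nodes" means, here is a minimal Python sketch of reverse-mode differentiation through a tiny two-layer net, checked against a finite-difference estimate. It is an illustration of the principle, not any particular historical implementation; the network shape and tanh nonlinearity are arbitrary choices:

```python
import numpy as np

def forward_backward(W1, W2, x, y):
    """One forward pass plus one reverse-mode (backpropagation) pass:
    the chain rule applied layer by layer, from the loss back to the weights."""
    h = np.tanh(W1 @ x)           # hidden layer (differentiable node)
    out = W2 @ h                  # linear output layer
    loss = 0.5 * np.sum((out - y) ** 2)
    # backward pass
    d_out = out - y               # dL/d(out)
    dW2 = np.outer(d_out, h)      # dL/dW2
    d_h = W2.T @ d_out            # chain rule through the output layer
    d_pre = d_h * (1 - h ** 2)    # chain rule through tanh: tanh'(z) = 1 - tanh(z)^2
    dW1 = np.outer(d_pre, x)      # dL/dW1
    return loss, dW1, dW2

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
x, y = rng.normal(size=2), rng.normal(size=1)
loss, dW1, dW2 = forward_backward(W1, W2, x, y)

# verify one weight gradient against a finite-difference estimate
eps = 1e-6
W1p = W1.copy()
W1p[0, 0] += eps
num = (forward_backward(W1p, W2, x, y)[0] - loss) / eps
print(dW1[0, 0], num)
```

Nothing in this computation is physics-specific: it is calculus applied to a composed function, which is exactly why it scales to today's deep networks.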
Backpropagation was also mentioned in the recent debate, although the Nobel Prize focuses on other things (otherwise the subsequent outcry would have been even greater). By 1985, compute was about 1,000 times cheaper than in 1970 [BP1], and the first desktop computers became accessible in wealthier academic labs. An experimental analysis of the known method [BP1-2] by Rumelhart et al. then demonstrated that backpropagation can yield useful internal representations in hidden layers of NNs [RUM].
Hinton claimed that Rumelhart invented backpropagation [AOI][HIN]. The "Scientific Background to the Nobel Prize in Physics 2024" [Nob24a], however, correctly cites Linnainmaa (1970) [BP1], who first published it, and Werbos (1982) [BP2], who first applied it to NNs [DLP][DLH][DL1]. It does NOT mention, however, that even later surveys by Hinton failed to cite the original work by Linnainmaa [DLP] and Amari's work (1967-68) on training networks with hidden layers through stochastic gradient descent (SGD) [GD1-2a]. Reference [BP4] has a compact history of backpropagation.
Note that Google Scholar (run by Hinton's former employer) hallucinates 55k citations for a 1986 backpropagation paper by Rumelhart et al. by simply adding the 28k citations of the book in which it appeared.
It is remarkable that Amari and his student Saito [GD2][GD2a] applied SGD to deep NNs in 1967-68 when compute was billions of times more expensive than today.
5. Many additional cases of plagiarism and incorrect attribution can be found in [DLP]. One can start with Sec. 3.
In science, in the end, the facts must always win. As long as the facts have not yet won, it's not yet the end. No fancy award can ever change that [NOB][HIN][T22][DLP]. The Romans already knew: magna est veritas et praevalebit (truth is mighty and will prevail).
REFERENCES
[NOB] A Nobel Prize for Plagiarism. Technical Report IDSIA-24-24.
https://people.idsia.ch/~juergen/physics-nobel-2024-plagiarism.html
See also two popular tweets (0.5M+ views / 1M+ views):
https://x.com/SchmidhuberAI/status/1844022724328394780
https://x.com/SchmidhuberAI/status/1865310820856393929
and a LinkedIn post (>1/3M views):
https://lnkd.in/eS92dg86
[DLP] How 3 Turing awardees republished key methods and ideas whose creators they failed to credit. Technical Report IDSIA-23-23, Swiss AI Lab IDSIA, 2023.
https://people.idsia.ch/~juergen/ai-priority-disputes.html
See also a popular tweet (350k views):
https://x.com/SchmidhuberAI/status/1735313711240253567
[DLH] Annotated History of Modern AI and Deep Learning. Technical Report IDSIA-22-22, IDSIA, Lugano, Switzerland, 2022. Preprint arXiv:2212.11279.
https://people.idsia.ch/~juergen/deep-learning-history.html
This extends the 2015 award-winning deep learning survey:
https://people.idsia.ch/~juergen/deep-learning-overview.html
See also the tweet:
https://x.com/SchmidhuberAI/status/1606333832956973060
[BP4] Who invented back-propagation?
https://people.idsia.ch/~juergen/who-invented-backpropagation.html
[PLAG1] Oxford's guide to types of plagiarism (2021). Quote: "Plagiarism may be intentional or reckless, or unintentional."
https://web.archive.org/web/20211227140321/https://www.ox.ac.uk/students/academic/guidance/skills/plagiarism
[PLAG2] Jackson State Community College (2022). Unintentional Plagiarism.
[PLAG3] R. L. Foster. Avoiding Unintentional Plagiarism. Journal for Specialists in Pediatric Nursing; Hoboken Vol. 12, Iss. 1, 2007.
[PLAG4] N. Das. Intentional or unintentional, it is never alright to plagiarize: A note on how Indian universities are advised to handle plagiarism. Perspect Clin Res 9:56-7, 2018.
[PLAG5] InfoSci-OnDemand (2023). What is Unintentional Plagiarism?
[PLAG6] Copyrighted dot com (2022). How to Avoid Accidental and Unintentional Plagiarism (2023). Copy in the Internet Archive. Quote: "May it be accidental or intentional, plagiarism is still plagiarism."
[PLAG7] Cornell Review, 2024. Harvard president resigns in plagiarism scandal. 6 January 2024.
[FAKE] H. Hopf, A. Krief, G. Mehta, S. A. Matlin. Fake science and the knowledge crisis: ignorance can be fatal. Royal Society Open Science, May 2019. Quote: "Scientists must be willing to speak out when they see false information being presented in social media, traditional print or broadcast press" and "must speak out against false information and fake science in circulation and forcefully contradict public figures who promote it."
[FAKE2] L. Stenflo. Intelligent plagiarists are the most dangerous. Nature, vol. 427, p. 777 (Feb 2004). Quote: "What is worse, in my opinion, ..., are cases where scientists rewrite previous findings in different words, purposely hiding the sources of their ideas, and then during subsequent years forcefully claim that they have discovered new phenomena."
Juergen Schmidhuber
Co-Chair, KAUST Center of Generative AI
Scientific Director, Swiss AI Lab IDSIA
CV:
http://www.idsia.ch/~juergen/cv.html