BMI Summer School and International Conference on Brain-Mind (ICBM) 2015

Brain Mind

Apr 6, 2015, 5:48:09 AM
to mlc...@googlegroups.com, Juyang Weng
The national government and the Shanghai municipal government are preparing to release guidelines for brain projects. The goals are, roughly, to understand the brain, protect the brain, and model the brain.

Many colleagues and students may think: we do not know how the brain works, so we should just use machine methods. I thought the same when I was a graduate student, but I now know I was wrong.

First, humans have already reached a rough computational understanding of how the brain works (see the overview below: Why BMI). The brain's working principles are quite unlike the machine learning, neural network, artificial intelligence, and robotics methods you are familiar with, or the methods recently adopted by several large U.S. companies, though they are related. To my knowledge, every major open problem in machine learning already has a good, known brain-based counterpart solution.

Second, learning the brain's way of learning requires taking several courses. The Brain-Mind Institute (BMI) was founded in 2012 for exactly this need. In 2015, the BMI Summer School and the International Conference on Brain-Mind enter their fourth year.
http://www.brain-mind-institute.org/. Congratulations to the BMI classes of 2012, 2013, and 2014!
http://www.brain-mind-institute.org/Classes.html

In 2015, you can attend yourself and recommend that your colleagues, students, and classmates attend. I am in my fifties; in 2012, I was the oldest student in the BMI Summer School courses BMI 811 (Biology) and BMI 821 (Brain Science).

BMI courses can be taken on site in the United States or through distance learning. Mainland China is in Region B, which receives a 50% discount.

Mon. June 1, 2015 - Fri. June 19, 2015 (3 weeks):
BMI 831 Cognitive Science for Brain-Mind, East Lansing, Michigan USA
Instructor: Gonzalo Munevar

Sat. June 20 - Sun. 21, 2015 (two days):
BMI International Conference on Brain-Mind (ICBM), East Lansing, Michigan USA

Mon. June 22, 2015 - Fri. July 10, 2015 (3 weeks):
BMI 861 Brain Automata Theory, East Lansing, Michigan USA
Instructor: Juyang Weng
This is a new course, meant for those who do not major in computer science.

Mon. July 20, 2015 - Fri. August 7, 2015 (3 weeks):
BMI 871 Computational Brain-Mind, East Lansing, Michigan USA
Instructor: Juyang Weng

Course applications (to get admitted so that you can register): by Sunday, April 12, 2015
ICBM full papers: by Sunday, April 19, 2015
ICBM abstracts: by Sunday, April 26, 2015
Notification of admission: Monday, April 27, 2015
Advance registration: by Sunday, May 10, 2015
International Conference on Brain Mind (ICBM): June 20-21, 2015

Those who prefer to learn at their own pace can get the corresponding course packages.
BMI 831 and BMI 871 course packages are now available at $970 each.
BMI 861 course packages will be available after July 10, 2015.

Why BMI

The Brain-Mind Institute (BMI) has been established to facilitate communication, education, and research on the science of our brains, including how each individual brain works and how brains work together in groups.

Historically, public acceptance of science was slow.  For example, Charles Darwin waited about 20 years (from the 1830s to 1858) to publish his theory of evolution for fear of public reaction.  About 20 years later (by the 1870s) the scientific community and much of the general public had accepted evolution as a fact.   Of course, the debate on evolution still goes on today.

Is the public acceptance of science faster in modern days?  Not necessarily, even though we now have better and faster means to communicate.  The primary reason is still the same but much more severe: the remaining open scientific problems are more complex, and the required knowledge goes beyond what any single person typically holds.

For instance, network-like brain computation — connectionist computation (e.g., J. McClelland and D. Rumelhart, Parallel Distributed Processing, 1986) — was long doubted and ignored by industry.  Deep convolutional networks appeared by at least 1980 (K. Fukushima).  The max-pooling technique for deep convolutional networks was published by 1992 (J. Weng et al.).  However, Apple, Baidu, Google, Microsoft, Samsung, and other major related companies did not show considerable interest until after 2012.  That is a delay of about 20 years.  The two techniques above are not very difficult to understand.  However, these two suddenly hot techniques have already been rendered obsolete by the discoveries of more fundamental and effective principles of the brain, six of which are intuitively explained below.

Industrial and academic interest has been keen on a combination of two things: easily understandable tests (e.g., G. Hinton et al. NIPS 2012, congratulations!) and the involvement of major companies (e.g., Google, thanks!).  We have read statements like “our results can be improved simply by waiting for faster GPUs and bigger datasets to become available” (G. Hinton et al. NIPS 2012).  However, the newly known brain principles tell us that tests conducted this way (e.g., ImageNet) will give only vanishing gains that do not lead to a human-like zero error rate, regardless of how long Moore's Law continues and how many more static images are added to the training set.  Why?  All such tests use static images in which objects mix with the background.  Such tests therefore prevent participating groups from seriously considering autonomous object segmentation (free of handcrafted object models).  Through synapse maintenance (Y. Wang et al. ICBM 2012), neurons in a human brain automatically cut off inputs from background pixels when those pixels match poorly compared with attended object pixels.  Human babies spend far more time in the dynamic physical world than looking at static photos.
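The synapse-maintenance mechanism mentioned above can be illustrated with a short sketch.  This is a toy illustration, not the algorithm of Y. Wang et al.; the fixed deviation threshold and the hard 0/1 cut-off are simplifying assumptions made for clarity.

```python
import numpy as np

def synapse_maintenance(weights, inputs, deviation_threshold=0.3):
    """Toy sketch: cut off synapses whose input matches the learned
    weight poorly (e.g., background pixels), keeping only well-matched
    (attended-object) synapses active.  weights and inputs are 1-D
    arrays of equal length, one entry per synapse.  Returns a 0/1
    mask: 1 keeps the synapse, 0 cuts it off."""
    deviation = np.abs(inputs - weights)                # per-synapse mismatch
    return (deviation <= deviation_threshold).astype(float)

# Toy example: the first three synapses see the attended object
# (inputs near the learned weights); the last three see background.
w = np.array([0.9, 0.8, 0.7, 0.5, 0.5, 0.5])
x = np.array([0.85, 0.82, 0.68, 0.1, 0.95, 0.0])
mask = synapse_maintenance(w, x)
print(mask)   # background synapses are cut off: [1. 1. 1. 0. 0. 0.]
```

The neuron's effective input then becomes `mask * x`, so poorly matched background pixels simply stop contributing to its response.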

Our industry should learn more powerful brain mechanisms that go beyond the conventional, well-known, well-tested techniques.  The following gives some examples:

(1) Deep Learning Networks (e.g., J. Weng et al. IJCNN 1992, Y. LeCun et al. Proceedings of IEEE 1998, G. Hinton et al. NIPS 2012) are not only biologically implausible but also functionally weak.  The brain uses a rich network of processing areas (e.g., Felleman & Van Essen, Cerebral Cortex 1991) where connections are almost always two-way, not a cascade of modules like the Deep Learning Networks.  Such a Deep Learning Network is not able to conduct top-down attention in a cluttered scene (e.g., attention to location or type in J. Weng, Natural and Artificial Intelligence, 2012 or attention to more complex object shape as reported in L. B. Smith et al. Developmental Science 2005).

(2) Convolution (e.g., J. Weng et al. IJCNN 1992, Y. LeCun et al. Proceedings of IEEE 1998, G. Hinton et al. NIPS 2012) is not only biologically implausible, but also computationally weak.  Why?  All feature neurons in the brain carry not only sensory information but also motor information (e.g., Felleman & Van Essen, Cerebral Cortex 1991), so that later-processing neurons become less concrete and more abstract --- which is impossible to accomplish using shift-invariant convolution.  Namely, convolution is always location-concrete (even with max-pooling) and never location-abstract.
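The location-concreteness noted in point (2) can be demonstrated directly: shifting the input pattern shifts the response peak by the same amount, so the representation stays tied to location.  A minimal 1-D sketch (the kernel and signals are arbitrary illustrations):

```python
import numpy as np

def correlate1d(x, k):
    """Valid-mode 1-D cross-correlation (the 'convolution' of CNNs)."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

k = np.array([1.0, 2.0, 1.0])             # a shared, shifted kernel
x1 = np.array([0, 1, 0, 0, 0, 0, 0.0])    # pattern at position 1
x2 = np.array([0, 0, 0, 0, 1, 0, 0.0])    # same pattern, shifted right

y1, y2 = correlate1d(x1, k), correlate1d(x2, k)
# The response peak moves with the input: convolution is
# shift-equivariant (location-concrete), not location-abstract.
print(int(np.argmax(y1)), int(np.argmax(y2)))   # prints: 0 3
```

Max-pooling over such responses only coarsens the location grid; it does not remove the location dependence.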

(3) Error back-propagation in neural networks (e.g., Y. LeCun et al. Proceedings of IEEE 1998, G. Hinton et al. NIPS 2012) is not only biologically implausible (e.g., a baby receives no error signals for its motor outputs) but also damaging to long-term memory because it lacks match-based competition for error causality (such as that in SOM, LISSOM, and LCA as optimal SOM).  Even though the gradient vector identifies a neuron that can reduce the current error, the current error is not that neuron's business at all, and it must keep its own long-term memory unchanged.  That is why error back-propagation is well known to be bad for incremental learning and requires research assistants to try many guesses of initial weights (i.e., using the test set as the training set!).  Let us not be blinded by artificially low error rates.
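The match-based competition that point (3) contrasts with back-propagation can be sketched as a winner-take-all Hebbian update in the spirit of SOM/LCA.  This is a simplified illustration, not the LCA algorithm itself; the learning rate and the two-neuron setup are arbitrary choices.

```python
import numpy as np

def match_based_update(W, x, lr=0.5):
    """SOM/LCA-flavored sketch: only the best-matching neuron
    (highest inner product with the input) updates its weights;
    losing neurons keep their long-term memory untouched.  This
    contrasts with back-propagation, which adjusts weights of
    neurons regardless of how well they match the input."""
    responses = W @ x                      # match via inner product
    winner = int(np.argmax(responses))     # competition: one winner
    W = W.copy()
    W[winner] += lr * (x - W[winner])      # Hebbian-style pull toward x
    return W, winner

# Two neurons with distinct long-term memories; an input near neuron 0.
W = np.array([[1.0, 0.0], [0.0, 1.0]])
x = np.array([0.9, 0.1])
W2, winner = match_based_update(W, x)
print(winner)        # neuron 0 wins: 0
print(W2[1])         # neuron 1's memory is unchanged: [0. 1.]
```

The losing neuron's weights are left exactly as they were, which is the point: the current error is not its business.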

Do our industry and public need another 20 years?

On the other hand, neuroscience and neuropsychology have made many advances by providing experimental data (e.g., Felleman & Van Essen, Cerebral Cortex 1991).  However, it is well recognized that these disciplines are data-rich and theory-poor.  The phenomena of brain circuits and brain behavior are extremely rich.  Many researchers in these areas use only local tools (e.g., attractors that can only settle into local extrema) and consequently have been overwhelmed by the richness of brain phenomena.  A fundamental reason is that they miss the guidance of the global automata theory of computer science, although previous automata do not emerge.  For example, X.-J. Wang et al. Nature 2013 stated correctly that neurons of mixed selectivity are rarely analyzed but widely observed.  However, mixed selectivity has already been well explained, as a special case, by the new Emergent Turing Machine in Developmental Networks in a theoretically complete way.  The traditional Universal Turing Machine is a theoretical model for modern-day computers --- how computers work --- but it does not emerge.  The mixed selectivity of neurons in this new kind of Turing Machine is caused by emergent and beautiful brain circuits, yet each neuron still uses a simple inner-product similarity in its high-dimensional and dynamic input space.
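The idea of an automaton whose transitions are learned from experience, with neurons selective to a combination of top-down state and bottom-up input, can be made concrete with a toy.  The sketch below conveys only the flavor and is not Weng's Developmental Network; the one-hot coding, the nearest-match lookup, and the parity automaton are assumptions chosen for clarity.

```python
import numpy as np

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

class ToyEmergentFA:
    """Toy illustration: learn a finite-automaton transition function
    incrementally by storing each (state, input) pattern as a neuron's
    weight and the taught next state as its motor output.  Each
    neuron's response is a plain inner product on its combined
    state+input vector, so it shows 'mixed selectivity': it fires
    maximally only for one combination of top-down state and
    bottom-up input."""
    def __init__(self, nq=2, ns=2):
        self.nq, self.ns = nq, ns
        self.Wy, self.Wz = [], []          # weights and next-state outputs

    def _pattern(self, q, sigma):
        p = np.concatenate([one_hot(q, self.nq), one_hot(sigma, self.ns)])
        return p / np.linalg.norm(p)

    def teach(self, q, sigma, q_next):     # recruit one neuron per pair
        self.Wy.append(self._pattern(q, sigma))
        self.Wz.append(q_next)

    def step(self, q, sigma):              # best-matching neuron fires
        p = self._pattern(q, sigma)
        j = int(np.argmax([w @ p for w in self.Wy]))
        return self.Wz[j]

# Teach the parity automaton: state 0 = even number of 1s seen so far.
fa = ToyEmergentFA()
for q, s, qn in [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]:
    fa.teach(q, s, qn)
q = 0
for sigma in [1, 1, 0, 1]:
    q = fa.step(q, sigma)
print(q)   # three 1s seen, odd, so state 1
```

Each recruited neuron responds at 1.0 only to its own state-input combination and at most 0.5 to any other, so the exact match always wins the competition.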

In October 2011, a highly respected multi-disciplinary professor kindly wrote: “I tell these students that they can work on brains and do good science, or work on robots and do good engineering.  But if they try to do both at once, the result will be neither good science nor good engineering.”  How long will it take for industry and the public to accept that this pessimistic view of the brain was no longer true even then?

The brain principles that have already been discovered could bring fundamental changes in the way humans live, the way countries and societies are organized, our industry, our economy, and the way humans treat one another. 

The known brain principles have told us that the brain of anybody, regardless of his education and experience, is fundamentally shortsighted, in both space and time.  Prof. Jonathan Haidt documented well such shortsightedness in his book “The Righteous Mind: Why Good People Are Divided by Politics and Religion”, although not in terms of brain computation.  

In terms of brain computation, the circuits in your brain self-wire beautifully and precisely according to your real-time experience (the genome only regulates), and the various invariance properties they need for abstraction also largely depend on experience.  Serotonin (triggered by, e.g., threats), dopamine (triggered by, e.g., praise), and other neurotransmitters quickly bias these circuits so that neurons for longer-term thoughts lose the competition to fire.  Furthermore, such bias has a long-term effect.  Therefore, you make long-term mistakes but still feel you are right.  Everybody is like that.  Depending on experience, shortsightedness varies in subject matter.

Traditionally, many domain experts think that computers and brains appear to use very different principles.  The naturally emergent Turing Machine in Developmental Networks, which has been mathematically proved (see J. Weng, Brain as an Emergent Finite Automaton: A Theory and Three Theorems, IJIS, 2015), should change our intuition.
The new result proposes the following six brain principles:

  1. The developmental program (genome-like, task-nonspecific) regulates the development (i.e., lifetime learning) of a task-nonspecific “brain-like” network, the Developmental Network.  The Developmental Network is general-purpose: in principle, it can learn any body-capable task, not only pattern recognition.
  2. The brain’s images are naturally sensed images of cluttered scenes where many objects mix.  In typical machine training (e.g., G. Hinton et al. NIPS 2012),  each training image has a bounding box drawn around each object to learn, which is not the case  for a human baby.  Neurons in the Developmental Network automatically learn object segmentation through synapse maintenance.
  3. The brain’s muscles have multiple subareas where each subarea represents either declarative knowledge (e.g., abstract concepts such as location, type, scale, etc.) or non-declarative knowledge (e.g., driving a car or riding a bicycle).   Not just discrete class labels in global classification.
  4. Each brain in the physical world is at least a Super Turing Machine in a Developmental Network.  Every area in the network emerges (does not statically exist; see M. Sur et al. Nature 2000 and P. Voss, Frontiers in Psychology 2013) using a unified area function whose feature development is nonlinear but free of local minima, contrary to engineering intuition --- not convolution; not error back-propagation.
  5. The brain’s Developmental Network learns incrementally—taking one pair of sensory and motor patterns at a time to update the “brain” and discarding the pair immediately after.  Namely, a real brain has only one pair of stereoscopic retinas, which cannot store more than one pair of images.  Batch learning (i.e., learn before test) is not scalable: without a mistake in an early test, a student cannot learn how to correct the mistake later.
  6. The brain’s Developmental Network is always optimal—each network update in real time computes the maximum likelihood estimate of the “brain”, conditioned on the limited computational resources and the limited learning experience in its “life” so far.  One should not use the test set as a training set, i.e., reporting only the best network after trying many networks on the test set.
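Principles 4 through 6 describe an incremental, competition-based update that consumes one sensory-motor pair at a time.  The following is a highly simplified sketch of such a loop, assuming top-1 competition and a plain running-average learning rate; it is not the actual Developmental Network algorithm.

```python
import numpy as np

def dn_step(Wy, ages, x, z):
    """One incremental, Developmental-Network-style update (simplified):
    take ONE sensory-motor pair (x, z), let the Y neurons compete by
    inner-product match on the concatenated input, update only the
    winner with a running-average (age-dependent) rate, then discard
    the pair.  Assumptions: top-1 competition, unit-normalized input."""
    p = np.concatenate([x, z])
    p = p / (np.linalg.norm(p) + 1e-12)     # normalized input pattern
    responses = Wy @ p                       # match by inner product
    j = int(np.argmax(responses))            # top-1 competition
    ages[j] += 1
    lr = 1.0 / ages[j]                       # running-average rate
    Wy[j] = (1 - lr) * Wy[j] + lr * p        # incremental Hebbian update
    return j                                 # index of the firing Y neuron

# Tiny "life": 3 Y neurons, 2-D sensory x, 2-D motor z, one pair at a time.
Wy = np.full((3, 4), 0.5)                    # three untaught Y neurons
ages = np.zeros(3, dtype=int)
winners = []
for x, z in [([1, 0], [0, 1]), ([0, 1], [1, 0]), ([1, 0], [0, 1])]:
    winners.append(dn_step(Wy, ages, np.array(x, float), np.array(z, float)))
print(winners)   # the repeated pair recruits the same Y neuron: [0, 1, 0]
```

Note that no pair is stored: once the winner's weights absorb the pattern, the pair is discarded, and a repeated experience simply re-fires and refines the same neuron.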

The logic completeness of a brain is (partially, not fully) understood through a Universal Turing Machine in a Developmental Network.  This emergent automaton brain model proposes that each brain is an automaton, yet one very different from all traditional symbolic automata because it programs itself: it is emergent.  No traditional Turing Machine can program itself, but a brain Turing Machine does.

The automaton brain model has predicted that brain circuits dynamically and precisely record the statistics of experience, roughly consistent with neural anatomy (e.g., Felleman & Van Essen, Cerebral Cortex, 1991).  In particular, the model predicted that “shifting attention between ‘humans’ and ‘vehicles’ dramatically changes brain representation of all categories” (J. Gallant et al. Nature Neuroscience, 2013) and that human attention “can regulate the activity of their neurons in the medial temporal lobe” (C. Koch et al. Nature, 2010).  The “place cells” work honored by the 2014 Nobel Prize in Physiology or Medicine implies that neurons encode exclusively bottom-up information (place).  The automaton brain model challenges such a view: neurons represent a combination of both bottom-up (e.g., place) and top-down context (e.g., goal), as reported by Koch et al. and Gallant et al.

Unfortunately, the automaton brain model implies that neuroscientists and neural network researchers cannot understand the brains they study without rigorous training in automata theory.  For example, traditional models for nervous systems and neural networks focus on pattern recognition and lack the capabilities of a grounded symbol system (e.g., “rulefully combining and recombining,” Stevan Harnad, Physica D, 1990).  Automata theory deals with such capabilities.  Does this new knowledge stun our students and researchers, or guide them so their time is better spent?

The Brain-Mind Institute aims to prepare everybody for the upcoming new brain era, so that no one falls behind, regardless of where in the world they are located.

Please contact me if you have any questions.

Juyang Weng
--
Juyang (John) Weng, Professor
Department of Computer Science and Engineering
MSU Cognitive Science Program and MSU Neuroscience Program
428 S Shaw Ln Rm 3115
Michigan State University
East Lansing, MI 48824 USA
Tel: 517-353-4388
Fax: 517-432-1061
Email: we...@cse.msu.edu
URL: http://www.cse.msu.edu/~weng/
----------------------------------------------