Biography

Charu Aggarwal is a Distinguished Research Staff Member (DRSM) at the IBM T. J. Watson Research Center in Yorktown Heights, New York. He completed his Bachelor of Technology in Computer Science from the Indian Institute of Technology at Kanpur in 1993 and his PhD in Operations Research (focus: mathematical optimization) from the Massachusetts Institute of Technology in 1996. He has worked extensively in the field of data mining, with particular interests in data streams, privacy, uncertain data, and social network analysis. He has authored 10 books, over 400 papers in refereed venues, and has applied for or been granted over 80 patents. His h-index is 136. Because of the commercial value of his patents, he has received several invention achievement awards and has been designated a Master Inventor at IBM. He has received an IBM Outstanding Technical Achievement Award and a Research Division Award for his contributions to System S, which was the first research prototype of IBM's streaming product (IBM InfoSphere Streams). He is a recipient of an IBM Corporate Award (2003) for his work on bio-terrorist threat detection in data streams, a recipient of the IBM Outstanding Innovation Award (2008) for his scientific contributions to privacy technology, and a recipient of two IBM Outstanding Technical Achievement Awards (2008) for his scientific contributions to high-dimensional and data stream analytics. He has received two best paper awards and an EDBT Test-of-Time Award (2014). He is a recipient of the IEEE ICDM Research Contributions Award (2015) and the ACM SIGKDD Innovation Award (2019), which are the two most prestigious awards for influential research in data mining. He is also a recipient of the W. Wallace McDowell Award, the highest award given by the IEEE Computer Society across the field of computer science.
He also received the ACM SIGKDD Service Award for service contributions to the data mining community. He has served as the general or program co-chair of the IEEE Big Data Conference (2014), the ICDM Conference (2015), the ACM CIKM Conference (2015), and the KDD Conference (2016). He has served as the editor-in-chief of ACM SIGKDD Explorations and is currently an editor-in-chief of the ACM Transactions on Knowledge Discovery from Data as well as of ACM Books. He is serving or has served as associate editor or action editor of several premier journals, including the IEEE Transactions on Knowledge and Data Engineering, the IEEE Transactions on Big Data, the Data Mining and Knowledge Discovery Journal, and the Knowledge and Information Systems Journal. He received the IIT Kanpur Distinguished Alumnus Award in 2023. He is a fellow of the IEEE (2010), the ACM (2013), and SIAM (2015) for "contributions to knowledge discovery and data mining algorithms."

Profiles: DBLP Publication Profile · Google Scholar Citation Profile · C.V. · Twitter Profile · LinkedIn Profile

You can download the PostScript/PDF files of my frequently accessed papers from my publication page. A more comprehensive list of publications is available from the DBLP database maintained by Michael Ley.

Contact Information:
Charu Aggarwal
IBM T. J. Watson Research Center, 1101 Kitchawan Rd, Yorktown, NY 10598
Email: charu (at) us (dot) ibm (dot) com

If you have sent me an email at my earlier address with domain name watson.ibm.com, it is likely that I have not received it.
This book covers the broader field of artificial intelligence. The book carefully balances coverage between classical AI (logic and deductive reasoning) and modern AI (inductive learning and neural networks). The chapters of this book span three categories:

Deductive reasoning methods: These methods start with pre-defined hypotheses and reason with them in order to arrive at logically sound conclusions. The underlying methods include search and logic-based methods. These methods are discussed in Chapters 1 through 5.

Inductive learning methods: These methods start with examples and use statistical methods in order to arrive at hypotheses. Examples include regression modeling, support vector machines, neural networks, reinforcement learning, unsupervised learning, and probabilistic graphical models. These methods are discussed in Chapters 6 through 11.

Integrating reasoning and learning: Chapters 12 and 13 discuss techniques for integrating reasoning and learning. Examples include the use of knowledge graphs and neuro-symbolic artificial intelligence.

The book is available in both hardcopy (hardcover) and electronic versions. The hardcover is available at all the usual channels (e.g., Amazon, Barnes and Noble), in Kindle format, and also directly from Springer in hardcopy and PDF format. PDF versions do have links and work with e-readers (including the Kindle reader). The PDF version (bought directly from Springer) provides better formatting of equations than the Kindle version and has an almost identical layout and pagination to the hardcopy on the e-reader.

LINEAR ALGEBRA AND OPTIMIZATION FOR MACHINE LEARNING: A TEXTBOOK
A frequent challenge faced by beginners in machine learning is the extensive background requirement in linear algebra and optimization. This makes the learning curve very steep. This book therefore reverses the focus by teaching linear algebra and optimization as the primary topics of interest, and solutions to machine learning problems as applications of these methods. As a result, the book also provides significant exposure to machine learning. The chapters of this book belong to two categories:
Linear algebra and its applications: These chapters focus on the basics of linear algebra together with their common applications to singular value decomposition, similarity matrices (kernel methods), and graph analysis. Numerous machine learning applications have been used as examples, such as spectral clustering, kernel-based classification, and outlier detection.
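As a flavor of the similarity-matrix material described above, here is a minimal sketch (not code from the book; the function name, `gamma` convention, and toy points are illustrative) of building a Gaussian (RBF) kernel similarity matrix, the basic object behind kernel methods and spectral clustering:

```python
import math

def rbf_kernel_matrix(points, gamma=1.0):
    """Gaussian (RBF) kernel similarity matrix:
    K[i][j] = exp(-gamma * ||x_i - x_j||^2).
    Entries near 1 mean 'very similar'; entries near 0 mean 'far apart'."""
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K

# Two nearby points and one distant outlier (toy data):
pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
K = rbf_kernel_matrix(pts)
```

The matrix K is symmetric with unit diagonal; spectral clustering works with the eigenvectors of (a normalized form of) exactly this kind of matrix.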
Optimization and its applications: Basic methods in optimization such as gradient descent, Newton's method, and coordinate descent are discussed. Constrained optimization methods are introduced as well. Machine learning applications such as linear regression, SVMs, logistic regression, matrix factorization, recommender systems, and k-means clustering are discussed in detail. A general view of optimization in computational graphs is discussed together with its applications to backpropagation in neural networks.
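To illustrate the pairing of an optimization method with a machine learning application described above, here is a minimal sketch (not from the book; the function name, learning rate, and toy data are illustrative) of gradient descent applied to least-squares linear regression in one variable:

```python
def gradient_descent_ls(xs, ys, lr=0.01, steps=2000):
    """Fit y ~ w*x + b by minimizing mean squared error with gradient descent.
    The gradients below are the exact partial derivatives of the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * gw  # step opposite the gradient
        b -= lr * gb
    return w, b

# Toy data generated exactly by y = 2x + 1:
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent_ls(xs, ys)
```

On this noiseless toy problem the iterates converge close to the true slope 2 and intercept 1; Newton's method or coordinate descent would reach the same minimizer along different trajectories.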
Lecture on backpropagation based on the book's presentation in Chapter 2 (provides a somewhat different approach to explaining it than you would normally see in textbooks):

This is the second edition of the popular neural networks and deep learning textbook. The book discusses the theory and algorithms of deep learning. The theory and algorithms of neural networks are particularly important for understanding key concepts in deep learning, so that one can understand the important design choices of neural architectures in different applications. Why do neural networks work? When do they work better than off-the-shelf machine learning models? When is depth useful? Why is training neural networks so hard? What are the pitfalls? Even though the book is not implementation-oriented, it is rich in discussing different applications. Applications associated with many different areas like recommender systems, machine translation, captioning, image classification, graph neural networks, reinforcement-learning based gaming, and text analytics are covered. The second edition is a significant update over the first edition, with material on graph neural networks, attention mechanisms, adversarial learning, transformers, and large language models. All chapters have been revised significantly. Detailed chapters on backpropagation and graph neural networks were added. The following aspects are covered in the book:
The basics of neural networks: Chapters 1, 2, and 3 discuss the basics of neural network design and also the fundamentals of training them. The simulation of various machine learning models with neural networks is provided. Examples include least-squares regression, SVMs, logistic regression, Widrow-Hoff learning, singular value decomposition, and recommender systems. Recent models like word2vec are also explored, together with their connections with traditional matrix factorization. Exploring the interface between machine learning and neural networks is important because it provides a deeper understanding of how neural networks generalize known machine learning methods, and the cases in which neural networks have advantages over traditional machine learning.
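The simulation idea described above can be sketched with a tiny example (not code from the book; the function names, learning rate, and toy data are illustrative): a single-unit "network" with a sigmoid activation, trained with gradient descent on the log-loss, is exactly logistic regression.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_single_unit(xs, ys, lr=0.5, epochs=500):
    """One neuron with sigmoid output and log-loss.
    The gradient of the log-loss reduces to (prediction - label) * input,
    so this training loop is exactly logistic regression."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        gb = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

# Linearly separable toy data: negative class on the left, positive on the right.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_single_unit(xs, ys)
```

Swapping the activation and loss in the same template recovers other classical models (identity activation with squared loss gives least-squares regression), which is the kind of correspondence the chapters develop.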
Challenges in training neural networks: Although Chapters 1 and 2 provide an overview of the training methods for neural networks, a more detailed understanding of the training challenges is provided in Chapters 4 and 5. In particular, issues related to network depth and also overfitting are discussed. Chapter 6 presents a classical architecture, referred to as radial-basis function networks. Even though this architecture is no longer used frequently, it is important because it represents a direct generalization of the kernel support-vector machine.
Advanced architectures and applications: A lot of the success in neural network design is a result of the specialized architectures for various domains and applications. Examples of such specialized architectures include graph neural networks, recurrent neural networks, and convolutional neural networks. Since the specialized architectures form the key to the understanding of neural network performance in various domains, most of the book is devoted to this setting. Several advanced topics like deep reinforcement learning, neural Turing machines, and generative adversarial networks are discussed.
Some of the "forgotten" architectures like RBF networks and Kohonen self-organizing maps are included because of their potential in many applications. The book is written for graduate students, researchers, and practitioners. The book does require knowledge of probability and linear algebra. Furthermore, basic knowledge of machine learning is helpful. Numerous exercises are available along with a solution manual to aid in classroom teaching. Where possible, an application-centric view is highlighted in order to give the reader a feel for the technology.