Slides

11/28/2023: Textbooks Are All You Need, Yin Tat Lee

Abstract: Many believed that training large language models (LLMs) required a vast dataset and an immense number of parameters. This is computationally demanding, requiring significant GPU resources. GPT-4 exemplified this belief, being a colossal model trained on a vast corpus.

In light of this, we sought to determine whether comparably impressive results could be achieved with smaller models and limited data for code generation. We demonstrate that with high-quality data, the demand for expansive datasets and a multitude of parameters lessens. The outcome was a model of only a few billion parameters that met or exceeded the performance of existing open-source models while using a mere 1/1000th of the training compute. Moreover, we will discuss specific emergent properties observed in the model after fine-tuning on coding exercises.

Bio: Yin Tat Lee is a Principal Researcher at MSR and an Associate Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. His research interests are convex optimization, convex geometry, graph algorithms, online algorithms, and differential privacy. During his career, he has received a variety of awards, including Best Paper Awards at FOCS, SODA, and NeurIPS, the Sprowls Award, an NSF CAREER Award, the A.W. Tucker Prize, a Microsoft Research Faculty Fellowship, a Sloan Research Fellowship, and a Packard Fellowship.

10/30/2023: Intelligent Heuristics Are the Future of Computing, Shang-Hua Teng

Abstract: Back in 1988, the partial game trees explored by computer chess programs were among the largest search structures in real-world computing.
Because the game tree is too large to be fully evaluated, chess programs must make heuristic strategic decisions based on partial information, making them an illustrative subject for teaching AI search. In one of his lectures that year on AI search for games and puzzles, Professor Hans Berliner, a pioneer of computer chess programs, stated: “Intelligent heuristics are the future of computing.”

As a student of the theory of computation, I was naturally perplexed but fascinated by this perspective. I had been trained to believe that “algorithms and computational complexity theory are the foundation of computer science.” However, as it happens, my attempts to understand heuristics in computing have subsequently played a significant role in my career as a theoretical computer scientist. I have come to realize that Berliner's postulation is a far-reaching worldview, particularly in the age of big, rich, complex, and multifaceted data and models, when computing interacts ubiquitously with science, engineering, humanity, and society.
In this talk, I will share some of my experiences with heuristics in computing, presenting examples of theoretical attempts to understand the behavior of heuristics on real data, as well as efforts to design practical heuristics with desirable theoretical characterizations. My hope is that these theoretical insights from past heuristics, such as spectral partitioning, multilevel methods, evolutionary algorithms, and simplex methods, can shed light on and inspire a deeper understanding of current and future techniques in AI and data mining.

Bio: Shang-Hua Teng is a University Professor and the Seely G. Mudd Professor of Computer Science and Mathematics at USC. He is a fellow of SIAM, ACM, and the Alfred P. Sloan Foundation, and has twice won the Gödel Prize: first in 2008, for developing smoothed analysis, and then in 2015, for designing the breakthrough scalable Laplacian solver. Citing him as “one of the most original theoretical computer scientists in the world,” the Simons Foundation named him a 2014 Simons Investigator to pursue long-term, curiosity-driven fundamental research. He also received the 2009 Fulkerson Prize, the 2021 ACM STOC Test of Time Award (for smoothed analysis), the 2022 ACM SIGecom Test of Time Award (for settling the complexity of computing a Nash equilibrium), the 2011 ACM STOC Best Paper Award (for improving maximum-flow minimum-cut algorithms), and the 2023 Science & Technology Award for Overseas Chinese from the China Computer Federation. In addition, he and his collaborators developed the first optimal well-shaped Delaunay mesh generation algorithms for arbitrary three-dimensional domains, settled the Rousseeuw-Hubert regression-depth conjecture in robust statistics, and resolved two long-standing complexity-theoretic questions regarding the Sprague-Grundy theorem in combinatorial game theory.
For his industry work with Xerox, NASA, Intel, IBM, Akamai, and Microsoft, he has received fifteen patents in areas including compiler optimization, Internet technology, and social networks. As the sole Chinese-speaking parent in an otherwise English-speaking family and environment, dedicated to teaching his daughter to speak Chinese, he has also become fascinated with children's bilingual learning.

10/23/2023: The Mathematics of Complex Streamed Data, Terry Lyons

Abstract: Complex streams of evolving data are better understood by their effects on nonlinear systems than by their values at particular times. Which nonlinear systems to use would seem to be context dependent, but it is not. Core to rough path theory is a simple universal nonlinear system that captures all the information needed to predict the response of any nonlinear system. This idealized mathematical feature set is known as the signature of the stream. Its abstract simplicity opens the possibility of understanding and working with streams in the same context-free way that calculators work with numbers. Signature-based techniques offer universal numerical methods that are simple to apply, robust to irregular data, and efficient at representing the order of events and complex oscillatory data. Specific software can be developed once and then applied across many contexts. Signatures underpin prize-winning contributions in recognizing Chinese handwriting, in detecting sepsis, in generating financial data, and, most recently, in the ability to score streams as outliers against a corpus of normal streams. This principled outlier technology has emerged as a powerful unifying technique; it identifies radio frequency interference in astronomical data, brain injury from MEG data… The underpinning theoretical contributions span a range from abstract algebra and non-commutative analysis to questions of organizing efficient numerical calculation. See
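To make the notion of a signature concrete, here is a minimal sketch (not the speaker's software; the function name and NumPy-based construction are my own) that computes the signature truncated at level two for a piecewise-linear stream. Level one is the total increment of the stream; level two accumulates the second iterated integrals segment by segment via Chen's identity.

```python
import numpy as np

def signature_level2(path):
    """Truncated (level-2) signature of a piecewise-linear path.

    path: (n_points, d) array of stream values.
    Returns (S1, S2): S1 is the d-vector of first iterated integrals
    (the total increment of the stream); S2 is the d x d matrix of
    second iterated integrals.
    """
    increments = np.diff(path, axis=0)  # increment of each linear segment
    d = path.shape[1]
    S1 = np.zeros(d)
    S2 = np.zeros((d, d))
    for dx in increments:
        # Chen's identity: concatenating a linear segment with increment dx
        # adds S1 (x) dx plus the segment's own term, (1/2) dx (x) dx.
        S2 += np.outer(S1, dx) + 0.5 * np.outer(dx, dx)
        S1 += dx
    return S1, S2
```

The symmetric part of S2 is determined by S1 (the shuffle relation S2 + S2ᵀ = S1 ⊗ S1), while its antisymmetric part records the Lévy area, which is genuinely order-of-events information that pointwise summaries miss.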