2024 Program details
- Dates: Monday, June 3 – Friday, June 28, 2024 (4 weeks)
- Timing: Mondays to Fridays, 10:00 AM – 5:00 PM
- Format: In-person
- Number of students selected: 12
- Each selected student will receive a $3,000 stipend and a laptop for participating in the program.
Target audience
Upper-level undergraduate students currently enrolled at a New York City-area college who are interested in attending graduate school in computer science or a related field and would benefit from an intensive introduction to data science. One goal of this program is to help build a more diverse and inclusive computer-science community, and we strongly encourage those from diverse, non-traditional, and under-represented backgrounds in STEM to apply.
Course description
This introduction to data science will cover tools and techniques for acquiring, cleaning, and utilizing real-world data for research purposes. In contrast to traditional course work, where one is often handed a prepackaged dataset obtained by a third party and prepared for a specific exercise, research projects often involve not only cleaning and preparing “messy” data but also acquiring that data oneself (e.g., through an API). The initial phase of these projects involves a good deal of exploratory analysis to gain a preliminary understanding of the dataset. Students will be introduced to scripting (on the command line and with Python and R) for these purposes and will gain direct experience in acquiring and modeling data from online sources.
The course also serves as an introduction to problems in applied statistics and machine learning. We will cover the theory behind simple but effective methods for supervised and unsupervised learning. Emphasis will be on formulating real-world modeling and prediction tasks as optimization problems and comparing methods in terms of practical efficacy and scalability. Students will learn to fit and evaluate such models, with applications including spam filtering and recommendation systems. Course material from previous years is available on GitHub.
During the last part of the course, students will work on an original research project in groups, led by Microsoft Research scientists.
Microsoft’s Event Code of Conduct
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. This includes events Microsoft hosts and participates in, where we seek to create a respectful, friendly, and inclusive experience for all participants. As such, we do not tolerate harassing or disrespectful behavior, messages, images, or interactions by any event participant, in any form, in any aspect of the program including business and social activities, regardless of location.
We do not tolerate any behavior that is degrading to any gender, race, sexual orientation, or disability, or any behavior that would violate Microsoft’s Anti-Harassment and Anti-Discrimination Policy, Equal Employment Opportunity Policy, or Standards of Business Conduct. In short, the entire experience at the venue must meet our culture standards. We encourage everyone to assist in creating a welcoming and safe environment. Please report any concerns, harassing behavior, or suspicious or disruptive activity to venue staff, the event host or owner, or event staff. Microsoft reserves the right to refuse admittance to or remove any person from company-sponsored events at any time at its sole discretion.