Data Science for ALL (2024)


This NSF-funded summer program teaches students the principles of data science and machine learning (ML). Students will learn the concepts of data modeling, data cleaning, data wrangling, and visualization. They will also learn basic Python programming for data processing, along with ML techniques including basic concepts, training, classification, and sentiment analysis.

The program will use the open-source Texera platform to help students become familiar with these concepts even if they have a limited computing background. It will include a capstone project in which students apply what they have learned by analyzing real data (e.g., social media data) with ML-based data science techniques. The instructors and staff include professors and Ph.D. students from UCI and UCLA who are experts in data management, data science, and machine learning.

Summer 2024 DS4ALL Photos

Summer 2024 DS4ALL Students and Instructors

Faculty Instructors


Dr. Chen Li, Department of Computer Science, UCI

Dr. Wei Wang, Department of Computer Science, UCLA

Ph.D. Student Instructors

Jiadong Bai, Department of Computer Science, UCI

Xinyuan Lin, Department of Computer Science, UCI


Yunyan Ding, Department of Computer Science, UCI

 


Anthony Cuturrufo, Department of Computer Science, UCLA

 


Alexander Kundu Taylor, Department of Computer Science, UCLA


Program Details

  • Program time: 07/08/2024 – 07/19/2024
  • Daily Schedule: From 9 AM to 4 PM
  • Location: UCI Campus
  • Instruction Format: The program will consist of both lectures and lab sessions.
  • Fees: Free of charge (funded by NSF)
  • Lunch: Lunch will be provided during the program.
  • Deadline to apply: 04/15/2024 by 11:59 PM (Closed)
  • Acceptance notification: Before 05/15/2024
  • Eligibility: Current sophomores (10th graders) and current juniors (11th graders)
  • Prerequisites: Algebra II or Integrated Math II
  • Contact email:

Program Schedule

Date | Day | Morning (Lecture) | Afternoon (Lab) | Slides
7/8/24 | 1 | Program Overview; Texera Platform Overview; Construct the 1st Data Analytics Workflow on Texera | Form teams; Project topic discussion | Slide01
7/9/24 | 2 | Data and data wrangling concepts; Data science operators – Scan, Projection, Type Cast, Sort, Filter | Collaborative project development in each team | Slide02
7/10/24 | 3 | Data science operators – Union, Distinct, Intersection, Diff, Aggregate, Join; Python basics; Python UDF tuple API basics | Collaborative project development in each team | Slide03
7/11/24 | 4 | Python intermediate; Python UDF tuple API exercise; Python UDF table API basics | Collaborative project development in each team | Slide04
7/12/24 | 5 | Python advanced; Python UDF table API exercise; Machine Learning introduction and hands-on practice using ML operators | Collaborative project development in each team | Slide05
7/15/24 | 6 | Introduction to AI; Classification | Collaborative project development in each team |
7/16/24 | 7 | Machine Learning Foundations; Linear Regression | Collaborative project development in each team |
7/17/24 | 8 | Neural Networks | Collaborative project development in each team |
7/18/24 | 9 | Computer Vision, Probability, Natural Language Processing | Collaborative project development in each team |
7/19/24 | 10 | ChatGPT: learn how to play around with it | Project Showcase |