Exciting News (September 2024): I'm releasing a v0 of DocETL, a system for complex LLM-powered document processing! I have been working on this for ~1.5 years. Check it out at docetl.org.
I'm also looking for undergrads who are interested in joining the project. If you're interested, please reach out! Your take-home interview is to close a GitHub issue or improve documentation. Feel free to email me if you get stuck.
About Me
I'm Shreya Shankar, a computer scientist in the Bay Area. I am completing my PhD at UC Berkeley, where I study AI-powered data processing with a human-centered focus, advised by Dr. Aditya Parameswaran. I am grateful to be supported by the NDSEG Fellowship. Go Bears! 🐻
I also consult on ML engineering and production AI strategy for enterprises. Prior to my PhD, I was the first ML engineer at a startup, did research engineering at Google Brain, and engineering at Facebook. Before all of that, I did my BS and MS in computer science at Stanford. Go Trees! 🌲
Bio (for speaking engagements, etc.)
Shreya Shankar is a PhD student in computer science at UC Berkeley, advised by Dr. Aditya Parameswaran. Her research addresses data challenges in production ML pipelines through a human-centered lens, focusing on data quality, observability, and more recently, leveraging large language models for data preprocessing. Shreya's work has appeared in top data management and HCI venues, including SIGMOD, VLDB, CIDR, CSCW, and UIST. She is a recipient of the NDSEG Fellowship and co-organizes the DEEM workshop at SIGMOD, which focuses on data management in end-to-end machine learning. Prior to her PhD, Shreya worked as an ML engineer and completed her undergraduate degree in computer science at Stanford University. In her free time, she enjoys roasting coffee and is actively trying to reduce her Twitter usage.
📰 News and Industry Impact
Recent News
- Our paper (Who Validates the Validators) got into UIST 2024! Presenting in September. 🇺🇸
- Our paper (Operationalizing Machine Learning: An Interview Study) got into CSCW 2024! Presenting in November. 🇨🇷
Companies That Like Our Work
🗣️ Selected Invited Talks
Upcoming
- Academia: Sherry Tongshuang Wu's WInE Lab at CMU
- Industry: "Search in the LLM Era" Course
Recent
👨‍🏫 Mentorship
I am fortunate to work with many talented students at UC Berkeley. Below is a list of students I am currently mentoring or have mentored for a year or more.
Current Students
- Reya Vir (UC Berkeley undergraduate) - Working on a benchmark for synthesizing data quality constraints for LLM applications.
- Quentin Romero Lauro (University of Pittsburgh undergraduate, REU at UC Berkeley) - Developing interfaces for iterating on retrieval-augmented generation (RAG) architectures for LLM applications.
- Rachel Lin (UC Berkeley master's student) - Developing interfaces for iterative dataset search with LLMs; co-mentored with Madelon Hulsebos.
Past Students
- Parth Asawa (former UC Berkeley undergraduate) - Worked on data quality constraints for LLM applications and declarative LLM workflows. Now pursuing a PhD at UC Berkeley.
- Yujie Wang (former UC Berkeley undergraduate) - Worked on monitoring ML performance metrics without ground-truth labels. Now at Google.
- Aditi Mahajan (former UC Berkeley undergraduate) - Worked on unit tests for end-to-end ML pipelines. Now at Google.
📬 Contact
Email: shreyashankar@berkeley.edu
Twitter | GitHub