Bin Yu, Chancellor’s Distinguished Professor of Statistics and EECS, University of California, Berkeley, Shares Her Achievements and Challenges

Written by: CDO Magazine Bureau

Updated 4:32 AM UTC, Mon July 10, 2023

Bin Yu, Chancellor’s Distinguished Professor of Statistics and EECS at the University of California, Berkeley

CDO Magazine publishes everything outstanding that is happening in the world of data and analytics. We introduce you to remarkable data organizations and great leaders through our special lists and nominations. We work throughout the year to bring you the latest in what is breaking down barriers and setting trends in the world of data. Our lists recognize the tremendous work performed to advance the cause of data and analytics worldwide, and we showcase the thought leaders’ accomplishments in their specific lines of work.

Our Leading Academic Data Leaders List 2022 honors these great academic leaders, introducing them on a global platform where they share their insights and work, highlighting their significant successes in the previous year, the challenges they faced, and their aspirations and goals for 2022.

Introducing: Bin Yu, Chancellor’s Distinguished Professor of Statistics and EECS at the University of California, Berkeley (Berkeley, Calif., USA)

1. What were your most significant achievements in 2021, or more broadly in the last two years, and why?

The most significant achievement in 2020 was the publication of my paper in PNAS with my former student, Karl Kumbier: “Veridical Data Science (PCS Framework),” PNAS, 2020 (QnAs with Bin Yu). This paper was 10 years in the making and is a substantial expansion of my 2013 paper "Stability" in the journal Bernoulli.

In the paper, we propose the Predictability, Computability, and Stability (PCS) framework (with documentation) for veridical (truthful) data science and provide guidelines for a quality-controlled data science life cycle, from problem formulation, data cleaning, EDA, and algorithm development to data conclusions. PCS unifies, streamlines, and expands on ideas and best practices of statistics and ML. Because of the central role of data science in AI, I believe the PCS framework is both important and practical for developing trustworthy AI that maximizes the promise of AI and minimizes its dangers. In particular, PCS asks that data conclusions be stable to perturbations in the data science life cycle that arise from different (legitimate) human judgment calls, which result in, say, different cleaned data versions, different data perturbations, and different supervised algorithms. Stability is an expanded concept of reproducibility and a minimum requirement for interpretability, reliability, and trustworthiness.
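
To make the stability idea concrete, here is a minimal Python sketch. It is an illustration only, not the procedure from the PNAS paper or its documentation: the function names, the two cleaning rules, the two estimators, and the synthetic data are all assumptions chosen for brevity. It refits a simple model under different cleaning choices, bootstrap resamples, and algorithms, then reports how often one conclusion ("the slope is positive") survives.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def median_pairwise_slope(xs, ys):
    """Median of all pairwise slopes (a Theil-Sen style robust estimate)."""
    n = len(xs)
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i in range(n)
        for j in range(i + 1, n)
        if xs[j] != xs[i]
    ]
    return statistics.median(slopes)


def keep_all(rows):
    """Cleaning choice A (hypothetical): keep every row."""
    return list(rows)


def drop_extreme_y(rows):
    """Cleaning choice B (hypothetical): drop rows whose y is > 3 sd from the mean."""
    ys = [y for _, y in rows]
    mean, sd = statistics.fmean(ys), statistics.pstdev(ys)
    return [(x, y) for x, y in rows if abs(y - mean) <= 3 * sd]


def sign_stability(rows, n_boot=50, seed=0):
    """Share of perturbed analyses in which the fitted slope is positive.

    Perturbations: two cleaning rules x two estimators x bootstrap resamples.
    """
    rng = random.Random(seed)
    positive = []
    for clean in (keep_all, drop_extreme_y):
        cleaned = clean(rows)
        for fit in (ols_slope, median_pairwise_slope):
            for _ in range(n_boot):
                sample = [rng.choice(cleaned) for _ in cleaned]
                xs, ys = zip(*sample)
                if len(set(xs)) < 2:  # slope undefined for a constant x
                    continue
                positive.append(fit(xs, ys) > 0)
    return sum(positive) / len(positive)


# Synthetic data (an assumption for the demo): y rises with x, plus noise.
demo_rng = random.Random(42)
rows = [(float(x), 2.0 * x + demo_rng.gauss(0.0, 5.0)) for x in range(30)]
score = sign_stability(rows)
print(f"share of perturbed fits with a positive slope: {score:.2f}")
```

A share close to 1 suggests the conclusion does not hinge on any single judgment call in this toy setup; a share near 0.5 would mean the conclusion is driven by those choices rather than by the data.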

2. What challenges are you facing in the academic data field?

There is still a considerable gap between academic data science research and practice or impact. Narrowing this gap is very challenging because large-scale data and domain problems are often not accessible to academic data scientists, and the pressure to publish gives them little incentive to pursue impact or real-world problem solving over paper publication.

3. What traits and qualities are required to be a successful academic data leader? 

Vision and out-of-the-box thinking (both technical and societal), integrity, fair credit sharing, calculated risk taking, passion for solving real problems, and a love of, and strong skills in, collaboration and teamwork.

4. Tell us about your priorities in 2022. What are your key targets? 

I hope to finish my textbook "Veridical Data Science" (with my former student and current postdoc Rebecca Barter).

It is built on the PCS framework and our research and teaching experiences at Berkeley. We have a contract with MIT Press, which kindly allows us to have a free interactive online copy. We intend to make it accessible to a wide-ranging audience, including upper-division undergraduate and beginning graduate students in data science, statistics, and CS, as well as people who want to get into data science. We emphasize critical thinking and cover the whole data science life cycle in the same order in which one solves a data science problem in the wild.

5. What advice would you offer aspiring academic leaders to help them prepare for the role? 

Don’t be afraid of risks and failures. Get out of your comfort zone, be open-minded and take the growth approach to learning, and be generous as a person.

Bin Yu is an American statistician and data scientist at the University of California, Berkeley. She holds faculty appointments in the departments of Statistics and of Electrical Engineering and Computer Sciences (EECS). Prof. Yu’s research focuses on the practice, algorithms, and theory of statistical machine learning and causal inference. Yu holds a BS in mathematics from Peking University, and an MS and PhD in statistics from UC Berkeley. She has previously held positions as assistant professor at the University of Wisconsin, Madison, visiting assistant professor at Yale University, and technical staff member at Lucent Bell Labs, and she has been a visiting faculty member at numerous universities around the world, including MIT. She is a member of the National Academy of Sciences and the American Academy of Arts and Sciences.

Bin Yu is one of our Leading Academic Data Leaders in 2022.
