Datacast


73 Episodes, produced by James Le

Datacast follows the narrative journey of data practitioners and researchers to unpack the career lessons they learned along the way. James Le hosts the show.


Episode 64: Improving Access to High-Quality Data with Fabiana Clemente

Show Notes
  • (02:06) Fabiana talked about her Bachelor’s degree in Applied Mathematics from the University of Lisbon in the early 2010s.
  • (04:18) Fabiana shared lessons learned from her first job out of college as a Siebel and BI Developer at Novabase.
  • (05:13) Fabiana discussed unique challenges while working as an IoT Solutions Architect at Vodafone.
  • (09:56) Fabiana mentioned projects she contributed to as a Data Scientist at startups such as ODYSAI and Habit Analytics.
  • (12:44) Fabiana talked about the two Master’s degrees she got while working in the industry (Applied Econometrics from Lisbon School of Economics and Management and Business Intelligence from NOVA IMS Information Management School).
  • (14:41) Fabiana explained the difference between data science and business intelligence.
  • (18:01) Fabiana shared the founding story of YData, the first data-centric platform with synthetic data, where she is currently the Chief Data Officer.
  • (21:32) Fabiana discussed different techniques to generate synthetic data, including oversampling, Bayesian Networks, and generative models.
  • (24:01) Fabiana unpacked the key insights in her blog series on generating synthetic tabular data.
  • (29:40) Fabiana summarized novel design and optimization techniques to cope with the challenges of training GAN models.
  • (33:44) Fabiana brought up the benefits of using Differential Privacy as a complement to synthetic data generation.
  • (38:07) Fabiana unpacked her post “The Cost of Poor Data Quality,” in which she defined data quality as a measure of data based on factors such as accuracy, completeness, consistency, reliability, and above all, whether it is up to date.
  • (42:11) Fabiana explained the important role that data quality plays in ensuring model explainability.
  • (44:57) Fabiana reasoned about YData’s decision to pursue the open-source strategy.
  • (47:47) Fabiana discussed her podcast called “When Machine Learning Meets Privacy” in collaboration with the MLOps Slack community.
  • (49:14) Fabiana briefly shared the challenges she encountered in getting the first cohort of customers for YData.
  • (50:12) Fabiana went over valuable lessons on attracting the right people who are excited about YData’s mission.
  • (51:52) Fabiana shared her take on the data community in Lisbon and her effort to inspire more women to join the tech industry.
  • (53:47) Closing segment.
Fabiana’s Contact Info

YData’s Resources

Mentioned Content

Blog Posts



  • Jean-Francois Rajotte (Resident Data Scientist at the University of British Columbia)
  • Sumit Mukherjee (Associate Professor of Statistics at Columbia University)
  • Andrew Trask (Leader at OpenMined, Research Scientist at DeepMind, Ph.D. Student at the University of Oxford)
  • Théo Ryffel (Co-Founder of Arkhn, Ph.D. Student at ENS and INRIA, Leader at OpenMined)

Recent Announcements/Articles


