Datacast


60 episodes. Produced by James Le.

Datacast follows the narrative journey of data practitioners and researchers to unpack the career lessons they learned along the way. James Le hosts the show.


Episode 54: Information Retrieval Research, Data Science For Space Missions, and Open-Source Software with Chris Mattmann

  • (2:55) Chris went over his experience studying Computer Science as an undergraduate at the University of Southern California in the late 90s.
  • (5:26) Chris recalled working as a Software Engineer at NASA Jet Propulsion Lab in his sophomore year at USC.
  • (9:54) Chris continued his education at USC with an M.S. and then a Ph.D. in Computer Science. Under the guidance of Dr. Nenad Medvidović, his Ph.D. thesis is called “Software Connectors For Highly-Distributed And Voluminous Data-Intensive Systems.” He proposed DISCO, a software architecture-based systematic framework for selecting software connectors based on eight key dimensions of data distribution.
  • (16:28) Towards the end of his Ph.D., Chris started getting involved with the Apache Software Foundation. More specifically, he developed the original proposal and plan for Apache Tika (a content detection and analysis toolkit) in collaboration with Jérôme Charron. Tika was later used to extract data from the Panama Papers, exposing how wealthy individuals exploited offshore tax regimes.
  • (24:58) Chris discussed his process of writing “Tika In Action,” which he co-authored with Jukka Zitting in 2011.
  • (27:01) Since 2007, Chris has been a professor in the Department of Computer Science at USC Viterbi School of Engineering. He went over the principles covered in his course titled “Software Architectures.”
  • (29:49) Chris touched on the core concepts and practical exercises that students could gain from his course “Information Retrieval and Web Search Engines.”
  • (32:10) Chris discussed his advanced course, “Content Detection and Analysis for Big Data,” which he has taught in recent years.
  • (36:31) Chris also served as the Director of USC’s Information Retrieval and Data Science group, whose mission is to research and develop new methodology and open-source software to analyze, ingest, process, and manage Big Data and turn it into information.
  • (41:07) Chris unpacked the evolution of his career at NASA JPL: Member of Technical Staff -> Senior Software Architect -> Principal Data Scientist -> Deputy Chief Technology and Innovation Officer -> Division Manager for the AI, Analytics, and Innovation team.
  • (44:32) Chris dove deep into MEMEX, a JPL project that aims to develop software that extends online search capabilities to the deep web, the dark web, and nontraditional content.
  • (48:03) Chris briefly touched on XDATA, a JPL research effort to develop new computational techniques and open-source software tools to process and analyze big data.
  • (52:23) Chris described his work on the Object-Oriented Data Technology platform, an open-source data management system originally developed by NASA JPL and then donated to the Apache Software Foundation.
  • (55:22) Chris shared the scientific challenges and engineering requirements associated with developing the next generation of reusable science data processing systems for NASA’s Orbiting Carbon Observatory space mission and the Soil Moisture Active Passive Earth science mission.
  • (01:01:05) Chris talked about his work on NASA’s Machine Learning-based Analytics for Autonomous Rover Systems — which consists of two novel capabilities for future Mars rovers (Drive-By Science and Energy-Optimal Autonomous Navigation).
  • (01:04:24) Chris quantified the Apache Software Foundation's impact on the software industry in the past decade and discussed trends in open-source software development.
  • (01:07:15) Chris unpacked his 2013 Nature article called “A vision for data science” — in which he argued that four advancements are necessary to get the best out of big data: algorithm integration, development and stewardship, diverse data formats, and people power.
  • (01:11:54) Chris revealed the challenges of writing the second edition of “Machine Learning with TensorFlow,” a technical book with Manning that teaches the foundational concepts of machine learning and how to use the TensorFlow library to build powerful models rapidly.
  • (01:15:04) Chris mentioned the differences between working in academia and industry.
  • (01:16:20) Chris described the tech and data community in the greater Los Angeles area.
  • (01:18:30) Closing segment.
