(16:28) Towards the end of his Ph.D., Chris started getting involved with the Apache Software Foundation. More specifically, he developed the original proposal and plan for Apache Tika (a content detection and analysis toolkit) in collaboration with Jérôme Charron to extract data in the Panama Papers, exposing how wealthy individuals exploited offshore tax regimes.
(36:31) Chris also served as the Director of the USC’s Information Retrieval and Data Science group, whose mission is to research and develop new methodology and open source software to analyze, ingest, process, and manage Big Data and turn it into information.
(41:07) Chris unpacked the evolution of his career at NASA JPL: Member of Technical Staff -> Senior Software Architect -> Principal Data Scientist -> Deputy Chief Technology and Innovation Officer -> Division Manager for the AI, Analytics, and Innovation team.
(44:32) Chris dove deep into MEMEX — a JPL’s project that aims to develop software that advances online search capabilities to the deep web, the dark web, and nontraditional content.
(48:03) Chris briefly touched on XDATA — a JPL’s research effort to develop new computational techniques and open-source software tools to process and analyze big data.
(52:23) Chris described his work on the Object-Oriented Data Technology platform, an open-source data management system originally developed by NASA JPL and then donated to the Apache Software Foundation.
(55:22) Chris shared the scientific challenges and engineering requirements associated with developing the next generation of reusable science data processing systems for NASA’s Orbiting Carbon Observatory space mission and the Soil Moisture Active Passive earth science mission.
(01:04:24) Chris quantified the Apache Software Foundation's impact on the software industry in the past decade and discussed trends in open-source software development.
(01:07:15) Chris unpacked his 2013 Nature article called “A vision for data science” — in which he argued that four advancements are necessary to get the best out of big data: algorithm integration, development and stewardship, diverse data formats, and people power.
(01:11:54) Chris revealed the challenges of writing the second edition of “Machine Learning with TensorFlow,” a technical book with Manning that teaches the foundational concepts of machine learning and the TensorFlow library's usage to build powerful models rapidly.
(01:15:04) Chris mentioned the differences between working in academia and industry.
(01:16:20) Chris described the tech and data community in the greater Los Angeles area.