The Bioinformatics CRO Podcast
Episode 35 with Bharath Ramsundar
Bharath Ramsundar, founder and CEO of Deep Forest Sciences, described many applications of artificial intelligence in biotech, society, and the military.
On The Bioinformatics CRO Podcast, we sit down with scientists to discuss interesting topics across biomedical research and to explore what made them who they are today.
You can listen on Spotify, Apple Podcasts, Amazon, and Pandora.
Bharath is the founder and CEO of Deep Forest Sciences, which builds AI for deep technology applications, and is the lead developer of the DeepChem open source project. He has founded multiple companies and authored 2 books.
Transcript of Episode 35: Bharath Ramsundar
Disclaimer: Transcripts may contain errors.
Grant Belgard: [00:00:00] Welcome to The Bioinformatics CRO Podcast. I’m Grant Belgard and joining me today is Bharath Ramsundar. Bharath is the founder and CEO of Deep Forest Sciences, a company that builds AI for deep tech. He’s also written two books, Deep Learning for the Life Sciences and TensorFlow for Deep Learning. Welcome to the show.
Bharath Ramsundar: [00:00:17] Thank you for having me on. I’m excited to speak with you today.
Grant Belgard: [00:00:20] Can you tell us about Deep Forest?
Bharath Ramsundar: [00:00:22] Yeah, absolutely. So what we do at Deep Forest is to help deep tech companies, often in biotech, but also sometimes in other industries, build out their AI stacks. What this means in particular really depends on the organization in question. In some cases, it’s more, say, strategic understanding of what AI can do for them or not do for them. In other cases, it’s much more in depth: actually building out an AI stack. We rely a lot on open source tools. I am the lead developer of the DeepChem project, which creates high quality deep learning and AI tools for scientific applications. So we leverage the open powers of DeepChem a lot to help build out high quality solutions for our customers. And in the long run, I think we’re moving towards actually building out more of a product layer as opposed to being something that’s purely consulting based, but that’s still in the R&D phase. But that’s where we think we see the future heading.
Grant Belgard: [00:01:16] Can you tell us more about that? Because of course you hear a lot about architectures needing to be bespoke to the problem at hand and so on.
Bharath Ramsundar: [00:01:25] So on the product end, I can’t share too many details, in all honesty in part because we’re just now at the phase of taking prototypes to our friends and customers to get their feedback. But I can say something about how we see custom architectures versus out-of-the-box architectures. So one of the strengths that DeepChem brings is that we have something like 30, maybe 40 architectures now that come out of the box, and each of them has many configurable hyperparameters. In my experience, and I might be quoting Andrew Ng from an earlier talk of his a number of years ago, going from no machine learning to any machine learning is, say, the first 80%. This is often accomplished by a simple statistical model like a random forest. Going from something like a random forest to maybe a deep model is that boost to 90%. And then finally, I’d say the last 10% is where you go from an out-of-the-box deep learning model to a fully customized deep learning model. With DeepChem, given that there is just such a broad variety of models, I think instead of 90%, maybe we get to 95% fairly out of the box. But yeah, at the end, if you really have a scaled-out application, you do want to develop your own deep learning architectures. But a lot of the customers we work with are sometimes even new to machine learning as a way of doing business. So a lot of the work we do, frankly, sits at the 80% and then the 95% stage. It’s only a few customers who are already sophisticated that want to really go to that 100%, designing custom architectures, which we also do.
Grant Belgard: [00:02:58] And for what kinds of problems is it often economically justifiable to put the resources into squeezing out that last 5 to 10%?
Bharath Ramsundar: [00:03:09] The cases where I’ve seen the last 5 to 10% be justifiable, sticking to biotech, are typically ones where you have a large assay that’s pretty well optimized, and this is a foundational technology for your company. Then it can be pretty worth it. In this case, you’re often not, say, a startup, or maybe you’re a later stage startup, maybe an early stage company. This is a critical technology that will be a pillar of your company for the next 5 to 10 years. Yeah, I think it’s worth it to spend some time optimizing it. It can take considerable expense. To put a ballpark number in people’s minds, I think the market rate is probably a few hundred thousand dollars at least to make a custom deep learning architecture for an application. Something out of the box is of course considerably cheaper, and the returns from going custom will be about 5%. If you have an optimized, scaled-out pipeline, that could be a steal: a few hundred thousand dollars, and maybe that’s millions of dollars in return. But oftentimes for early stage companies these numbers just don’t make sense, and if that math doesn’t make sense for you, I’d say out of the box is your friend, unless of course you’re a deep learning expert yourself, in which case you can roll something yourself and then make that part of your core proprietary technology, which a number of companies do as well.
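The break-even reasoning in this answer can be written out as a few lines of arithmetic. The sketch below is only an illustration: the function name, the $50M and $1M pipeline values, and the $300,000 build cost are made-up assumptions chosen to match the "few hundred thousand dollars" and "about 5%" ballparks mentioned above, not figures from the episode.

```python
def custom_model_net_return(pipeline_value, relative_gain, build_cost):
    """Net return from replacing an out-of-the-box model with a custom one.

    pipeline_value: yearly value (in dollars) flowing through the pipeline.
    relative_gain:  extra fraction of that value the custom model captures.
    build_cost:     one-off cost of designing the custom architecture.
    """
    return pipeline_value * relative_gain - build_cost


# Hypothetical scaled-out pipeline worth $50M a year (made-up number).
scaled_out = custom_model_net_return(50_000_000, 0.05, 300_000)

# Hypothetical early stage pipeline worth $1M a year (made-up number).
early_stage = custom_model_net_return(1_000_000, 0.05, 300_000)

print(f"scaled-out pipeline: {scaled_out:+,.0f} dollars")
print(f"early stage company: {early_stage:+,.0f} dollars")
```

Under these assumed numbers the custom architecture pays for itself several times over on the large pipeline and loses money on the small one, which is the "out of the box is your friend" case.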
Grant Belgard: [00:04:24] So Deep Forest also has a Substack, where you’ve recently written mostly about aviation and space exploration. Is that a special interest of yours and where do you see applications of AI in those industries?
Bharath Ramsundar: [00:04:40] The way the Deep into the Forest Substack works is that we do about 5 to 10 week tours of different industries. So we started publishing this year. Our first tour was on semiconductors. We did about ten weeks doing a deep dive into semiconductors. Our most recent ten week tour is in aviation. Typically, we move between various industries. We’ve done climate change. We’ve done energy. We will definitely do biotech in the not too distant future. So I’d say for us it’s more that we specialize in really building that deep market understanding of all these different industries. I think one of the powers that AI and deep tech really bring is that, to put it carefully, some of the technology I’m working with for a customer in the energy battery space is not that dissimilar from technology that I’m using for a customer in the biotech space. A lot of the ideas carry over, as do some of the deep learning architectures. Again, the out-of-the-box 95% carries over; the custom stuff, of course, is a different field entirely. But going with our understanding that most people really want up to that 95%, we think that there’s just considerable cross-disciplinary pollination.
[00:05:56] In the aviation space, for example, I think that CFD solvers for simulating fluid dynamics and turbulence are, of course, a major mainstay. If you look at a lot of federal grants, if you look at what Lockheed and Boeing pay for, it’s fluid dynamics simulators. There’s recently been a surge in deep learning techniques in fluid dynamics. So Google recently had a paper on their new system. They call it JAX-CFD. So they’re using their new JAX deep learning framework to actually write a fluid dynamics solver that is machine learning optimizable. I believe that although these techniques are still in very early days, they’re going to have a dramatic impact on the way aircraft are designed in the coming 5 to 10 years. And frankly, for every industry we’ve done this in, I think this is the case. If you look at semiconductors, computational lithography is going to have a major impact. Google of course released their paper recently showing how they use reinforcement learning to design the next TPUv4 chip. So just given, I think, the pan-industry power of these technologies, we see commonalities in application, and we see that not just in theory but in practice with real customers.
Grant Belgard: [00:07:06] Where do you think we are in the hype cycle for ML and do you think it varies by industry?
Bharath Ramsundar: [00:07:12] I think that AI for drug discovery is maybe even a little past the peak of the hype cycle. There was a lot of froth in that funding market for a while. It’s settling down a little bit, but there are still a lot of things getting funded. I’d say now AI for drug discovery is probably one of the more market-mature applications. The first startups, I think probably pioneers like Atomwise or others, got their start in 2015, 2014, something like that. So it’s been around for a while. I think a lot of investors now have an understanding of what these companies can do and cannot do, which is that in some cases, if you look at companies like Recursion, there actually have been very powerful exits for investors and I think real technology invented. But cancer isn’t cured yet, to say the least. And as I’m sure you all will have seen at The Bioinformatics CRO, drug discovery is a hard, hard problem. And we don’t expect AI to really ride in on a white horse and cure anything anytime soon. But I think we’re nearing an understanding that these techniques are quite useful in practice, in moderation, by a team that knows what they’re doing. Which is, again to quote that Gartner hype cycle, maybe we are moving down towards the trough of disillusionment, but then over to the steady state of useful application. In other industries I think we are much earlier. So things like fluid dynamics, things like high-performance computing, I think it’s just at the early days where people are beginning to realize, oh wait, these techniques are actually applicable to our work.
[00:08:39] So I think that it’s probably several years behind on that funding cycle and I anticipate more hype coming about. There have also just been, I think, considerable advances in the technology of that field that are very recent. So I believe there is a recent paper, I think from Anima Anandkumar’s group at NVIDIA, on neural operators, where they take these partial differential equation solvers and use essentially deep learning. The technical explanation is that they work in Fourier space and make transformations there. But these could potentially speed up solutions of certain classes of differential equations, which will just have broad applications in fluids, in energy, a variety of different use cases. So I think that’s probably a foundational paper that we’ll only begin to see play out over the next five years. So yes, to give a long-winded answer, it very much depends on the field. I think in drug discovery, maybe we’re in for a bit of disillusionment as people realize that the techniques are very useful, but they’re not going to cure anything. Whereas in other fields I think there will be just, Oh wow, what if we could design a flying aircraft that’s hyper efficient? I hope I don’t play spoiler for everyone, but I don’t anticipate that type of revolutionary advance off the bat in any field. But I think that it is nearing a place of broad applicability where techniques are just useful to people in many industries.
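To make "they work in the Fourier space and they make transformations there" a little more concrete, here is a minimal one-dimensional sketch of that idea: transform a signal to Fourier space, rescale a handful of low-frequency modes, discard the rest, and transform back. This is not the implementation from the paper mentioned above. The function names are hypothetical, the per-mode weights are fixed by hand rather than learned, and a naive discrete Fourier transform stands in for a fast library routine.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Naive inverse discrete Fourier transform."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_layer(signal, weights):
    """Scale the lowest Fourier modes by per-mode weights and drop the rest.

    weights[k] multiplies mode k. In a trained model these would be learned
    parameters; here they are plain numbers supplied by the caller.
    weights[0] should be real so that a real input gives a real output.
    """
    n = len(signal)
    if len(weights) > n // 2 + 1:
        raise ValueError("at most n // 2 + 1 modes can be weighted")
    coeffs = dft(signal)
    filtered = [0j] * n
    for k, w in enumerate(weights):
        filtered[k] = coeffs[k] * w
        # Mirror the weight onto the conjugate mode to keep the output real.
        if 0 < k < n - k:
            filtered[n - k] = coeffs[n - k] * w.conjugate()
    return [value.real for value in inverse_dft(filtered)]


# Keep only the first harmonic of a two-harmonic signal and double it.
n = 8
wave = [math.cos(2 * math.pi * t / n) + math.cos(6 * math.pi * t / n) for t in range(n)]
result = spectral_layer(wave, [0.0, 2.0])
```

In this toy case the third harmonic is thrown away and the first harmonic comes back with twice its amplitude. A real neural operator stacks several such layers with learned weights and nonlinearities in between.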
Grant Belgard: [00:10:02] And how do you think about how companies can build that out? I think it’s fair to say there’s a pretty severe shortage of people with substantial experience in deep learning. I mean, obviously there are a lot more people who have some shallow understanding through MOOCs and things like this. But how do you think the landscape looks in terms of employment and training in that space?
Bharath Ramsundar: [00:10:27] I think that the educational tools have really gotten much better. It’s much easier for engineers to just pick up some basic machine learning. Just looking locally, my family definitely had a few older engineers who got bored and picked up a Coursera course or two and now do some basic machine learning. These are, say, veterans of the IT industry later in their careers who are bored with their day jobs. So I think that there is a very positive effect where you will begin to see experienced people in all sorts of disciplines just pick up some basic machine learning and realize that, hey, this isn’t that exotic. I will also say at the same time, though, that in my experience there’s a steep curve, not at the early theory stages, but in figuring out how to apply these things in practice. This means that your organization will need to work out infrastructure questions like: Where do I do my compute? How do I run this on AWS? Where do I store the data? How do I version control models? How do I keep up to date with all the latest and greatest on the deep learning library infrastructure? While this is certainly possible, I think, for a medium sized company, it’s very hard to keep up with the speed at which the industry moves. With a package like DeepChem, we are a fairly sizable open source package at this point, with a pretty active developer community. And still we struggle to keep up with just the fire hose that Google or Facebook or whoever puts out, where they can just keep putting the world’s best new PhDs onto a problem. It just means that it’s very hard to stay abreast of the latest stuff in the field.
Bharath Ramsundar: [00:12:01] So I think that there is considerable room for good software solutions to play a middle ground. I think teams will want to control their data, absolutely. They will want to control their models. In the long run, they won’t want to have someone who’s the middleman constantly holding their hand. I think to get off the ground, though, having someone is invaluable. But I think what we see as the future is giving teams the ability to run more complex systems, but do so in a way that is as easy as it can be for them. So there are definitely some solutions from the big players; AWS SageMaker, I think, is the one that’s quoted a lot. Unfortunately, in our experience, SageMaker is not yet really ready for custom applications. While it does, say, the out-of-the-box things, the random forest equivalent, if you look at, say, just their logging capabilities or how you actually monitor a system on SageMaker, it’s not really at the place where I can recommend it for anyone to use. So there’s a new crop of companies, like Weights & Biases for example, that are offering new developer tools that are beginning to pick up some of this gap. So that’s where we see the future of the broader AI developer market. For deep tech, we think that there are a lot of things about scientific projects that are just distinct enough from everyday data science applications that there’s room for a product in that space. And that’s the niche that we’re exploring right now.
Grant Belgard: [00:13:33] That’s very interesting. So are there specific areas of deep tech that you think are lagging far behind in terms of application of deep learning?
Bharath Ramsundar: [00:13:45] I would say that the answer is perhaps the opposite, in that I think drug discovery has been a standout for how fast it’s moved in adopting deep learning technologies. I would say that nearly every other field is far behind. The closest I’ve seen in second place is that materials science has recently started to see a boom in more machine learning applications. There are some really cool projects, Matminer and the pymatgen community, that have been leading this charge. But if you look at a lot of materials science papers, I think that it’s just getting off the ground. Facebook recently launched the Open Catalyst Program to do machine learning on catalyst design. So I think you’ll begin to see a lot more members of that community really take up these tools as it becomes clear that these techniques work. So that’s probably what I would say is the runner-up field right now: materials. But yeah, if you move past that, it’s very early research right now. There’s a lot of interest. I think a lot of companies are interested in applying these tools. I know that if you look at companies that are doing things like designing cars, they’re again very interested. I think people are very interested in using these reinforcement learning techniques that Google puts to use. But it’s challenging. If you look at reinforcement learning, it’s notoriously finicky. I think Ray has done a great job of making this easier for people, but I’d say it still requires an expert team that really understands what they’re doing.
Bharath Ramsundar: [00:15:11] So there’s this gap where there are things Google or OpenAI can do, and then there are things that everyone else can do. Even a very solid academic team, or frankly a company like ours, can do things, but we can’t run them on, say, 100 TPUs or whatever Google can toss together. As for the other players, I’d say the Chinese ecosystem has been putting in piles of money. So Tencent, for example, I’d say is always, say, two steps behind Google, which is frankly probably five steps ahead of everyone else. So they are innovating in their own right while also being fast followers. I will say for non-Chinese companies, there are major downsides to depending on the Chinese ecosystem. You can see the controversy with TikTok or Zoom, I would say, about data privacy and security. But they are frankly innovating and doing excellent work as well. For the rest of us, I think we need to figure out how we maintain cloud stacks and how we actually scale out our learning infrastructure. And I don’t think there’s a turnkey solution for a new company, even, say, one like (I’m going to name a name) Lockheed, to waltz in and say, I want a deep learning stack. I actually think that will take considerable investment in time. And I’m sure they’ve been doing this already for several years, or trying.
Grant Belgard: [00:16:27] So speaking of China and Lockheed in the same segment, is it clear where the focus of applications are in China? Is this something that the military is playing a role in?
Bharath Ramsundar: [00:16:43] There is considerable aggression out of China right now. It’s like the 100th anniversary of the founding of the Communist Party, I believe. China of course has comparatively done very well in the coronavirus pandemic, which has boosted its international profile. I would say there is considerable anxiety amongst the military about the capabilities of China. I’m not an expert in AI policy, but I would, as a bystander, say that in terms of AI advancement, I think the US is doing just fine. I see very good work coming out of the Chinese institutions, but I see better work coming out of Google or Facebook or DeepMind. Where I think the Chinese ecosystem is ahead is in physical manufacturing capabilities. I just read a report this morning about how the US Navy is overbooked and a little bit under budget, whereas the PRC Navy has been dramatically accelerating its shipbuilding, which we have a Deep into the Forest piece about. CSSC, the China State Shipbuilding Corporation, is the world’s biggest shipmaker. And they use the same docks to make aircraft carriers as they do commercial oil tankers. So were I a military planner, and I know there are military planners worrying about this, I would worry about the Navy. I’d worry about the physical hardware. In terms of intelligence, I think the US is fine. In AI, Google and others are doing more than fine right now.
Grant Belgard: [00:18:10] And where is Europe in all this?
Bharath Ramsundar: [00:18:12] Well, that’s an excellent question. I think that Europe has been putting a lot of money into getting their AI ecosystem off the ground. I see a lot of DeepMind of course, but there’s also, I think, increasingly a number of sophisticated European AI companies. And there are some neat companies doing things in the hardware space. ASML of course has continued to innovate and do excellent work. So I’d say Europe has a fine ecosystem. In many ways the European ecosystem is, you could say, a mirror of the American ecosystem, but there’s a bigger focus on consumer privacy and data protection, perhaps a tad more regulation, which is I’d say better for consumers, but maybe a little harder for businesses. So I don’t think Europe is doing badly at all. But I think the rising juggernaut of course is China and, maybe a couple of steps behind, the Indian ecosystem, which has had a lot of innovations I think at the app layer, but not, say, quite to the same degree as the Chinese ecosystem right now.
Grant Belgard: [00:19:13] Are there any other regions doing notable work in this space?
Bharath Ramsundar: [00:19:17] I think that there’s a lot of interest in Africa. I think there’s been some great deep learning conferences. I think there’s a lot of really talented students who are starting to build out community there. I think the Nigerian tech ecosystem is also booming, for example. So there are definitely innovators in all these spaces. I know less about the South American ecosystem, so I won’t comment there. I know there’s a couple of cool companies out of Brazil. Yeah. So I think that’s probably the other player. Australia has of course been doing lots of cool stuff in different areas. I think with Australia a big focus right now is this ongoing trade war with China, which is a very challenging situation that they’re facing. So I think the geopolitical side in the Pacific is unfortunately bifurcating everyone, where you’re with the US or you’re with China. I think, unfortunately, the way that things are shaping up, there is probably going to be a lot of competition in that entire part of the world in the coming decades. You asked about our newsletter; I think part of the reason we do these analyses is that geopolitics and AI and deep tech are just intricately tied. For example, companies working on better ships, I think, will have a very bright future 5 to 10 years ahead when the Navy realizes, as it’s realizing right now, that oh crap, we have a problem on our hands. There are many people who have been doing these analyses, but what we try to do is look at the broad picture across industries and tie together these thoughts of what we see in different spaces, where there’s actually an underlying theme.
Grant Belgard: [00:20:46] What impact has COVID 19 had on uptake of AI, if any? I mean, do you think it’s had an impact at all?
Bharath Ramsundar: [00:20:56] I will say that it’s broadened the geographic scope to some degree. With DeepChem, for example, we were a project that grew out of Stanford, out of my PhD thesis work. The early community was people who showed up at events we put on for the local Stanford community. We’d put out pizza. Typically, a friendly company would rent out their space. We’d have a couple of talks, people would mingle. So understandably, our early contributors came from Palo Alto or thereabouts. But increasingly now I think the community is very, very global. So we had some calls this morning with people from Switzerland, people from India, people from Japan. California, of course, remains pulled out. But I think there’s considerable geographic widening where more and more, I would say, the AI community is distributed. All the work happens on the cloud anyways. It doesn’t really matter where you are that much. So I think that COVID has accelerated a trend that was already happening by removing the physical necessity of being in one spot. Yeah, this is something that will likely be around to stay. At the same time, if you’re an entrepreneur, I think there’s an advantage to just hanging out in San Francisco, which even today is considerable. So I think both these things are simultaneously true. So I anticipate you’ll probably have a whole bunch of companies where the founders come and set up shop here and staff the engineers or team wherever in the world. So then you get the best of both worlds. You get talent globally, but founders locally. And that’s a choice we’ve made. So I’m based in the Bay Area, but the Deep Forest Sciences team is in many places.
Grant Belgard: [00:22:33] Can you tell us more about DeepChem and how it came about?
Bharath Ramsundar: [00:22:38] Yeah, absolutely. So a number of years ago I had the good fortune to intern at Google at their Accelerated Sciences team. So we did some cool work. We trained some deep models that were for the time, I think, very cool and very large. I had an excellent internship there. But as with all good things, the internship ended, I had to head back to grad school, and I realized, Oh crap, my best results are at Google and I can’t replicate any of this. So I set about trying to replicate them. And at the time Francois Chollet had just put out Keras, which was just an amazing tool. So the original version of DeepChem was an adaptation of Keras to training multitask networks, which is what we’d built on chemical data. And I wanted to share it with some friends down the hall, so I put it up on GitHub and made it an open repo, and things just grew from there. I think the code base has been rewritten considerably multiple times over the last few years. I think our first use case that drew in a lot of people was that we had some good implementations of graph convolutions contributed by some engineers who got involved with the project early on.
Bharath Ramsundar: [00:23:42] So that drew a lot of people into the community. But increasingly today I think that DeepChem is evolving into an AI for science framework. So we continue to have, I would argue, the best open source suite of machine learning tools for chemistry. But increasingly we have very powerful capabilities in materials science and protein design, and early work in other fields; we’ve been tinkering with some hopes of getting some fluids support off the ground. So where we see the future of DeepChem going is to really make it easy to apply AI to scientific applications. Deep Forest Sciences of course supports this extensively because I’m the same person and a lot of the same core theses drive this. And I think the model we follow is that open core is just a powerful base for any company, because you can have technology that’s vetted by scientists and experts across the world. You get corner cases filled, you get bug reports figured out, you get people contributing their time because it’s open and they can also benefit from it, which you don’t get otherwise.
Grant Belgard: [00:24:47] So you also co-founded Computable, right?
Bharath Ramsundar: [00:24:49] Yeah, absolutely. Computable’s core technology was building out what I would call these data co-ops. The idea behind a data co-op is that if you have a group of people who are gathering a data set, they deserve a right to having some equity in that data set. The motivating example we started from was these genomic patient datasets. If you are, say, a rare disease patient group and you contribute your genomic data in the quest for a cure, at the least you should be getting some royalties from that. Maybe not even for yourselves, but so you can continue funding research into your rare disease. So the technology that we built out was a system to track ownership of a data set and to parcel out royalties to the owners of the data set whenever it was used. So we built a system for this on Ethereum, before of course this most recent giant crypto boom. But I think it was just a case of cool technology but wrong timing. And we found that the blockchain was just too onerous to use. We had major UI issues. Customers liked the concept, but when they figured out that they had to click seven times to do anything because of complicated back-and-forth permission granting to Ethereum, and it took several minutes for each transaction to go through, it just didn’t get off the ground. So I stepped back from that team a couple of years back. The team has since rebranded and is trying a few other experiments. But I think it was a really cool idea we had. I just think that would be a great project to try again, but, say, five years from now, once these technologies have matured.
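The core bookkeeping of a data co-op as described here, tracking who owns what share of a data set and parceling out a royalty each time it is used, can be sketched without any blockchain at all. The example below is a hypothetical illustration and not Computable's actual protocol: the pro rata rule, the function name, and the contributor names are all assumptions made for the sake of the sketch.

```python
def split_royalty(payment_cents, ownership):
    """Split one usage payment among data set owners in proportion to stake.

    payment_cents: the royalty for a single use, as an integer number of cents.
    ownership:     mapping of owner name to a positive integer stake.
    Cents that do not divide evenly go, one each, to the owners with the
    largest fractional remainders, so the payouts always sum to the payment.
    """
    total_stake = sum(ownership.values())
    if total_stake <= 0:
        raise ValueError("ownership must contain a positive total stake")

    # Integer floor of each owner's pro rata share.
    payouts = {
        owner: payment_cents * stake // total_stake
        for owner, stake in ownership.items()
    }

    # Distribute whatever the floors left over.
    leftover = payment_cents - sum(payouts.values())
    by_remainder = sorted(
        ownership,
        key=lambda owner: (payment_cents * ownership[owner]) % total_stake,
        reverse=True,
    )
    for owner in by_remainder[:leftover]:
        payouts[owner] += 1
    return payouts


# Hypothetical patient group in which each member contributed one genome.
members = {"patient_a": 1, "patient_b": 1, "patient_c": 1}
payouts = split_royalty(1000, members)
print(payouts)
```

Working in integer cents avoids floating point rounding, so no money is created or lost in the split. The hard parts the answer describes, such as permissioning and transaction latency on Ethereum, are outside what this sketch covers.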
Grant Belgard: [00:26:21] So can you tell us in a nutshell about your path? You grew up in the Bay Area, and I’ll let you take it from there.
Bharath Ramsundar: [00:26:30] So I grew up in the Bay Area. For my first job out of college, I worked as an engineer at a company called Fusion-io. We used to make these non-volatile flash devices that we’d sell to Facebook and Apple for quite a markup. This was a proprietary software stack that was very efficient. The company was later bought out by SanDisk. Unfortunately, the commoditization of that hardware market just removed the margins that made that company really, really work out. So I spent about a year there, then left to go to grad school at Stanford. At the time, deep learning was very hot. I got into that, did a lot of coursework, and had the good fortune to work with some collaborators who knew more about the chemistry and drug discovery side than I did. So I learned some of that, did this project with Google, and then from there started the DeepChem open source project. After the PhD, I was more entrepreneurially minded than academically, so I decided to try co-founding a startup. At the time, this was in 2017, there was a big crypto boom, so we got caught up in the craze and tried building cool things with crypto.
Bharath Ramsundar: [00:27:36] But as I just mentioned, the technology was not ready, I think, for the applications that we wanted to do. So after stepping back from that company, I decided to take some time off. So I consulted a little bit with a few friends. But the patterns that I’ve been talking about with Deep Forest Sciences became apparent, and near the end of last year I decided to actually say, okay, there’s something here. Let’s make this an actual company. So that’s what I’m working on full time now, and we’re working to grow out Deep Forest Sciences. And of course, this entire time the DeepChem community has been steadily growing, and we’ve been maturing the code base and expanding it. At this time, I think it’s something like 70,000 lines of code, and that’s getting to, I think, a pretty sophisticated numerical scientific infrastructure. We still have a long way to go, really; science is so vast. There are so many places software can make a difference, I think. So it’s probably another 5, 6, 10, 15 years for this tool to really mature.
Grant Belgard: [00:28:34] Do you think at some point DeepChem may need to be renamed?
Bharath Ramsundar: [00:28:39] We’ve definitely had some discussion about that. Yeah, probably. But I figure let’s build the infrastructure and the communities first and at some point the name will work itself out. But yeah, it’s entirely possible that we need a better name. But for now, I think everyone in the project understands it’s broader than just chemistry. It is something we have to tell newcomers as they come in: Hey, I know we’re named for chemistry, and we do do that, but we also do other things. So we’ll make sure we don’t miss out on that eventually.
Grant Belgard: [00:29:08] What do you expect will be the most visible, successful applications of AI over the next decade?
Bharath Ramsundar: [00:29:17] OpenAI, I think, is likely to use its GPT-3 technology for some high profile applications. So I think they put out this very recently, Copilot or something like that, where they’re using these GPT-3 technologies to help aid code autocomplete essentially. I think this is going to be a very radically powerful technology. We’ve seen that the way software is developed now is very different from the way it used to be. So we all have our continuous integration systems. We have all these automated controls on a software pipeline. It’s what enables a small team like DeepChem’s to maintain what probably 10 or 15 years ago would have required a large company to maintain in terms of code sophistication. I think these trends will only accelerate. It’s going to be a case where, one, it does become easier for everyone to develop software using AI techniques. But I also think it’s going to be the case that the people who have the best understanding and control of these methods will, rich-get-richer style, make the most use of it. So I think that this is going to have a dramatic impact on the developer market right now. One thing I see is that if you look at, say, something like front end developers, if you go back 20 years, web developers were in very limited market supply. But as you know, the growth in these technologies has dramatically widened that as you’ve had code bootcamps, and increasingly now, I think autocomplete style tools are the next generation of this.
[00:30:49] You’ll start to see a bit of the pricing on that market fall down. So my guess is that what will happen is that you’ll be able to have say one senior developer who can corral AI tools to do more, which might undercut the market for some of these bootcamps over time. I think that something similar is likely to happen in basic data science as well. There have been a lot of basic data science bootcamps. A lot of kids are learning these skills in college. My 2 cents is that learning math and stats is something that you can’t go too wrong with, and that it’s just such a powerful way of thinking about the world. So I’m not too worried about college students who picked up some extra stats not making use of those skills. But I do think that everyone in the tech ecosystem will have to continually revamp our toolchains and our understanding of these technologies in order to stay ahead of the increasingly powerful AI co-pilots that are probably coming our way. One example of this, a little bit, is in the field of pure mathematics. Peter Scholze, who’s one of the world’s foremost mathematicians, recently put out this blog post where there is a conjecture that he’d strongly suspected was true, but had not been able to entirely verify to his liking. But he was able to use Lean, which is a new proof assistant/dependently typed programming language from Microsoft Research, to formulate a proof that it actually was correct. And he was very surprised that these tools were at the point where they actually could aid in the work of cutting edge mathematics, and were not just something you use afterward to re-prove old theorems.
Bharath Ramsundar: [00:32:26] So I think that these trends will only accelerate. I think software will cannibalize software far before it succeeds in cannibalizing plumbers, for example, who I think can probably sit safe. I don’t anticipate plumbing robots coming about for the next several decades. So weirdly, I think you will see a case where plumbers and mechanics will sit tight knowing that their jobs are likely to be in high demand. Whereas for developers, I think there will be a bit of a struggle where you need to upskill yourself in order to be competitive in this market. And that means probably just learning more mathematics and learning more systems software. At the high end, at the systems scale and at the sophisticated mathematics, I don’t think those skills will go away. But if you only know how to do basic HTML, you might be in for a tough time. So that’s maybe the biggest shift I see: AI is going to move on the software industry, and that’s where I think it will have its biggest impacts, which I think the world more broadly will be insulated from. But for those of us who work in the industry, I think we better crack open those textbooks and start learning some more things. Otherwise we’ll be replaced.
Grant Belgard: [00:33:29] What do you think is the best way to go about that for people already in the workforce?
Bharath Ramsundar: [00:33:33] That is an excellent question. I think that MOOCs are really amazing. I think just starting from a MOOC, starting from YouTube, Wikipedia even, can be surprisingly useful, and so can doing lots of side projects. As a developer, I think that you have your 9 to 5, you have things that you have to do for your work. Like for me for example, the 9 to 5 is of course this AI deep tech, but I have persistent interests in compilers. That’s something that I’ve toyed with on the side with nothing to really show for it. But I find these experiments often come in and really change the way I do my day job for the better. Having a curiosity and willingness to do these toy projects helps. I didn’t know any React, so I spent some time hacking together a very simple React app. It was terrible. A good frontend developer could do much better, but I think it broadened my understanding. So I think for all of us, just given how vast computer science is, make something that you don’t know much about, maybe like a simple graphics rendering system. It doesn’t have to be professional grade. It just has to be a weekend or two of hacking. I actually think that’s an excellent way of keeping up our skills. I think our greatest advantage is our flexibility. If you needed to get a good Java developer to code in C++, they could probably figure that out given a couple of months, whereas I think that’s going to be harder for an AI system to do necessarily. So I think flexibility is where we have a persistent advantage over the machines.
Grant Belgard: [00:35:01] Well, I wonder what do you think of an alternative hypothesis that basically these tools will be developed to aid and supercharge developers. And maybe it’s not so much a matter of having developers put out of work as just having dramatically more software development around the world. I mean, how likely do you think that outcome is?
Bharath Ramsundar: [00:35:23] That’s a really, really good question. I don’t know for sure at all. I think this is such a complicated question that all of these could come about. If you look at, say, web development as an example, I think that in one way, the number of web developers has skyrocketed. People use Squarespace or similar tools all over the world to make new websites. And you could argue you are a web developer if you’ve built a website on Squarespace. At the same time, the skilled web development market I think has gone in two directions. You have the very high end folks who are building these ultra sophisticated stacks. Yeah, of course they’re around. They are probably getting paid multiples of what they were getting paid before. But I think the “put a basic website together” market has vanished. It’s so easy with no-code or do-it-yourself tools now that, as a new entrepreneur, I put together Deep Forest Sciences’ basic website on Squarespace. We have a new website that we’re actually building as a custom app, but Squarespace is good enough for us for getting something up. Similarly, I think for a lot of development applications, we’ll see a similar bifurcation.
Bharath Ramsundar: [00:36:35] I think the current market will be automated out. You’ll have the upskilled market where, yeah, it’s going to be very, very lucrative. In one sense, the number of AI developers I think is going to just dramatically increase. But if in the future you say, Hey Siri, can you put together a basic website for me, and Siri does it, are you an AI developer? Yeah, in some sense you are. You’ve figured out how to interface with an AI and make it do useful things for you. But as for the person who would do that work right now, either they’ve turned into the Siri developer themselves, so to speak, and upskilled themselves, or, as another option, they’ve decided, let’s just use my skills to build out a business, which I think a lot of technical folks increasingly do. I can see both of these, but again I should put an asterisk on these claims in that I only have one limited view into this very, very broad ecosystem. So I am as ignorant as anyone in terms of knowing all the dynamics that are happening. It’s such a complicated industry. There are so many things going on, so we’ll have to wait and see.
Grant Belgard: [00:37:37] I think it’ll be a really interesting decade.
Bharath Ramsundar: [00:37:39] Yeah, computer science, I think, is one of these fields that just reinvents itself all the time. So I think that this is probably a very natural part of the process that’s been going on since the 50s.
Grant Belgard: [00:37:48] Thank you so much for joining us. It has been fun.
Bharath Ramsundar: [00:37:48] Thank you for having me on.