Season 1 – Episode 19 – Accelerating Antibiotic Discovery Using AI
Discusses the use of artificial intelligence (AI) to accelerate antibiotic discovery.
Episode Transcript
Daniel Smith: Welcome to On Tech Ethics with CITI Program. Our guest today is César de la Fuente, who is a presidential assistant professor at the University of Pennsylvania. César’s research focuses on using computational approaches to accelerate discoveries in biology and medicine. Specifically, he pioneered the development of the first computer designed antibiotic with efficacy in animal models, demonstrating the application of AI for antibiotic discovery and helping launch this emerging field. Today, we’re going to discuss the use of AI for antibiotic discovery.
Before we get started, I want to quickly note that this podcast is for educational purposes only. It is not designed to provide legal advice or legal guidance. You should consult with your organization’s attorneys if you have questions or concerns about the relevant laws and regulations that may be discussed in this podcast. In addition, the views expressed in this podcast are solely those of our guest.
On that note, welcome to the podcast, César.
César de la Fuente: It’s great to be here.
Daniel Smith: It’s a pleasure to have you. I gave you a very brief introduction, so can you share some more about yourself and what you currently focus on at the University of Pennsylvania?
César de la Fuente: Yes, absolutely. I guess the initiative of the lab is to really harness the power of machines and use that to accelerate discoveries in biology and medicine. I’m really lucky to be able to do that with team members that come from different places around the world, but also have different backgrounds and different ways of thinking.
Just to give you an example, right now in the lab, we have computer scientists, chemists, synthetic biologists and microbiologists all working together to try to accelerate antibiotic discovery by using a combination of AI methods, and also a lot of experimental evaluation. That’s what we’re aiming to do.
Daniel Smith: I’m excited to learn more about how AI can be used to accelerate antibiotic discovery. To get started, can you talk a bit about some of the current challenges with antibiotic discovery and development?
César de la Fuente: Absolutely. Antibiotic resistance is a huge global health problem. Just to give you a sense of the magnitude, it currently kills over one million people every single year in the world. These are bacterial infections that are untreatable with conventional antibiotics. It gets worse actually, because the current projection is that by the year 2050, 10 million people will die each year as a consequence of bacterial infections, which is predicted to surpass every other major cause of death in our society, even including cancer. We’re facing a silent pandemic, where a lot of these emerging pathogens are becoming increasingly resistant to all the antibiotics that we have available in pharmacies and hospitals. More and more, we see patients coming into the clinic with infections that are really untreatable, even with combination therapies. It’s a huge concern and problem that affects every corner of the world, and we’re very much motivated by this.
One of the things that is quite surprising is that antibiotic-resistant infections are, I would say, one of the most underinvested problems in the world among those that kill the most people. If you look at statistics and numbers from big pharma and companies, over the years they’ve reduced, completely sold off, or stopped investment in antibiotic discovery. Right now in the world, in 2024, it’s up to a couple of academics, crazy academics, in different institutions around the world to try to think outside of the box. Think about how we can push the boundaries on antibiotic discovery, how we can think about the problem in a different way, how we can develop novel approaches that are different from traditional methods to try to accelerate this and try to really come up with truly novel antibiotics that can eventually, hopefully, save lives.
Daniel Smith: Now, I want to hear more about some of those novel approaches and specifically how AI can help overcome some of those challenges. But first, I think it would be helpful to briefly go over how AI is used in antibiotic discovery. Can you provide us with an overview of how AI is used in this context?
César de la Fuente: Absolutely. Maybe I can give you some sort of historical context, and maybe threading in my own personal experience in this field. Let’s go back.
When I was in my PhD at the University of British Columbia, I was really interested in understanding biology from first principles, so I dedicated my PhD to learning about the simplest living organisms that exist, which are these wonderful creatures called bacteria. I learned all sorts of things about how bacteria become harmful to humans and the mechanisms that they utilize to do that. Then towards the end of my PhD, I started working with antibiotics, trying to engineer small proteins to see if they could be used as a new class of antibiotics.
But one of the things I realized is that biology is obviously very complex, chaotic, playful and extremely difficult to engineer. Oftentimes, I would find myself making a mutation on the molecules that I was trying to develop as a new antibiotic, and then having to go back to the drawing board. Oftentimes, things wouldn’t work. I realized that of course, biology was not programmable. I was a bit dissatisfied by this need for endless trial-and-error experimentation that was quite painstaking.
Then I had this epiphany that, with advances in compute power, I thought that AI could actually help revolutionize biology, and more specifically maybe help revolutionize the invisible biology, which is microbiology and also the field of antibiotics. Then what I did is I had this opportunity, I was recruited at MIT, which back then was a mecca for AI. There were a lot of people working on AI, but in different domains. Domains that were unrelated to biology. Things like pattern recognition, systems for speech and image recognition. At the time, the general consensus is that computers and AI could not really be used in biology because biology was just too complex for computers to have any sort of useful input.
But nevertheless, I was able to absorb a lot of information from other fantastic researchers working at MIT and applying AI to other systems. Along with our great collaborators, we decided to ask a very fundamental question, which was can a computer be used to create an antibiotic? Again, this is the origins of this field, so we had to just start by asking really critical questions and really simple questions in retrospect. We thought about this for a long time. For example, one of the critical questions that we asked ourselves was how can we teach a computer to innovate at the molecular level, to create something new? To create diversity at the molecular level.
After much thinking and brainstorming, we came up with this idea to just simply mimic the greatest engine that we have in the world to create something new, something novel, something diverse at any level. If you think about it, that’s evolution itself, the evolutionary process. What we decided to do is we decided to train a computer to execute Darwin’s algorithm of evolution. But the really cool thing is that, instead of having to wait for millions of years for a molecule to evolve through the natural evolutionary process, on the computer, we can compress that timescale tremendously and we can do that in a matter of hours.
That’s what we did. We trained an algorithm that was capable of taking natural molecular sequences from nature, and was able to simply apply evolution. Of course, the primary steps of evolution are mutation, selection and recombination, and we built a feedback loop in order to create that process in a continuous fashion. The computer was able to create new antibiotics, entirely new antibiotics, and then it basically gave us a number of molecules that it predicted that would be great antimicrobials.
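The loop described above (mutation, recombination, selection, repeated as a feedback loop) can be sketched in a few lines of Python. This is only an illustrative toy and not the algorithm from the 2018 work: the `toy_fitness` function, the seed sequences, and every parameter value are assumptions made up for the example, and a real fitness function would encode far more knowledge about what makes a good antimicrobial.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_fitness(peptide: str) -> float:
    """Hypothetical stand-in for a fitness function.

    Rewards net positive charge and hydrophobic residues. This is an
    assumption for illustration, not the function used in the lab.
    """
    charge = sum(peptide.count(aa) for aa in "KR") - sum(peptide.count(aa) for aa in "DE")
    hydrophobic = sum(peptide.count(aa) for aa in "AILMFVW")
    return charge + 0.5 * hydrophobic


def mutate(peptide: str, rng: random.Random, rate: float = 0.1) -> str:
    """Replace each residue with a random one with probability `rate`."""
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in peptide)


def recombine(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover between two equal-length sequences."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(seeds, generations=50, population_size=40, seed=0):
    """Run the mutation / recombination / selection loop and return the
    final population, best first. Needs at least two equal-length seeds."""
    rng = random.Random(seed)
    population = list(seeds)
    for _ in range(generations):
        # Selection: keep the top quarter (at least two) as parents.
        population.sort(key=toy_fitness, reverse=True)
        parents = population[: max(2, population_size // 4)]
        # Recombination and mutation: refill the population with children.
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(recombine(a, b, rng), rng))
        population = parents + children
    return sorted(population, key=toy_fitness, reverse=True)


if __name__ == "__main__":
    # Arbitrary placeholder sequences, not real natural peptides.
    seeds = ["ACDEFGHIKLMN", "NPQRSTVWYACD", "GGSGGSGGSGGS"]
    best = evolve(seeds)[0]
    print(best, toy_fitness(best))
```

Because the parents are carried over unchanged into each new generation, the best score in the population can never decrease. The ranked candidates coming out of a loop like this are still only predictions until they are tested experimentally.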
But of course, this project was started around 2015 or so and it got published in 2018. Back then, we couldn’t just rely on the computer’s assumptions. We couldn’t just trust that the computer was correct. We needed to experimentally evaluate to see whether the computer was correct or incorrect. What we did is we synthesized chemically, we made a lot of these molecules that the computer created in the lab. We tested them against bacteria, that are clinically relevant bacteria that are problematic in our society. We were able to find, out of those screening efforts experimentally, particularly one molecule that was really good at killing bacteria. We called this molecule guavanin. This was really a computer created molecule that now was able to not only look pretty on the computer screen, because it has this beautiful [inaudible 00:09:25] structure, but it was actually capable of killing clinically relevant pathogens in the lab.
Then we wanted to learn how this guavanin molecule was able to kill bacteria. We performed a critical experiment, and what we found is that it killed bacteria in a completely different way compared to most antibiotics of the same type. Basically it targeted the membrane in the reverse way compared to most antibiotics. This was really puzzling to us because that was in no way incorporated or written into the computer program. Looking back at that experiment, I think it really represents one of the first examples in biology of the emergent properties of AI. These were properties that were not incorporated, not written into the fitness function by human researchers, and then realizing that the AI itself can lead to novel properties that were unanticipated. Of course, we’ve seen that now in many other domains, the ability of AI systems to come up with new things, but perhaps that was one of the early examples in a field as complex and chaotic as biology.
Then the final experiment that I think is critical to highlight is that we were able to see that guavanin, this computer made molecule, was capable of reducing infections in a particular mouse model. That was really an experiment that convinced us that this would be a new field. That you could create something in the computer that would not only kill bacteria in vitro, in Petri dishes, but also in mouse models. Since then, this field has been growing. We published that paper in 2018, so in the last half a decade or so, this field has grown into a more mature field where … A hugely [inaudible 00:11:11] in our field, by the way. It brings researchers from computer science, from microbiology, from synbio, all together to try to tackle this huge problem of antibiotic resistance.
Daniel Smith: That’s really fascinating. Just to back up for a moment, I’m curious to hear about when you did that initial training of the algorithm. What types of data did you use?
César de la Fuente: In that case, we actually used available data of natural molecules from nature. Back then, you didn’t have huge training sets. Right now, in the latest projects that we’ve been conducting in my lab, we’ve been using larger training sets, and if you want, I can tell you a little bit more about some of those projects.
Daniel Smith: Absolutely. That would be great.
César de la Fuente: Great. That was work that we conducted at MIT, and then I was recruited at the University of Pennsylvania. Here, the initial question that we asked ourselves was can we use computers to accelerate antibiotic discovery? This was motivated by the fact that it takes a long time to develop a new drug, including a new antibiotic, so from the moment that a molecule is discovered in the lab to the time that it actually has an impact on patients, it takes over a decade on average. It’s a long, winding road to get to have an impact on people. It’s also a hugely expensive endeavor. On average, it costs over $2 billion to develop an antibiotic. Just to put this in perspective, this is more than the budget that NASA or SpaceX have to take a rocket to outer space or to the moon. It’s an incredibly expensive and slow process. We hypothesized that maybe AI could help accelerate this, thereby reducing the time and the cost associated with the discovery of antibiotics. That was the initial premise.
We were asking ourselves, how can we go about this? Through some of the technologies that I had learned back at MIT, the pattern recognition algorithms that I mentioned earlier, we decided to apply those algorithms for antibiotic discovery. But of course, instead of recognizing facial expressions or sounds, we wanted to recognize molecular patterns, code, biological code to see if we could find potential antibiotic molecules in biology. By biology, I mean things like genomes and proteomes. Genomes being all the genes in a particular organism, and proteomes, all the proteins that are expressed and encoded by those genes.
What we did is we started with very simple pattern recognition algorithms that could mine individual proteins to see if we could find antibiotic molecules included in those proteins. This was in collaboration with colleagues over in Italy. We performed this work in a number of projects, mining individual proteins. But then, we realized that with advances in compute and algorithmic power, we could scale up from mining individual proteins to mining entire proteomes. These are all the proteins encoded in a genome. We decided to embark on this very exciting journey of, for the first time, mining the human proteome as a source of antibiotics.
When I talk about the human proteome, you can think of the roughly 20,000 proteins that are encoded in our genome, but in this case we also took into account isoforms. In the end, we sampled with the algorithm over 40,000 proteins. Which if you do a calculation, it corresponds to around 100 million peptides. This was obviously a huge endeavor at the time. We thought that the algorithm would take a while to process through all this data. To our surprise, within an hour of the onset of the project, the algorithm stopped and it finalized the task. Initially, this was very surprising. We thought there had been maybe an error in the algorithm. We went back to the code, there was no bug. We simply realized that the algorithm was actually a pretty simple scoring function that was very computationally inexpensive. It was extremely rapid, so in this one hour it was able to sample the whole human proteome.
What this work led to was the discovery of thousands of new what we call encrypted molecules, encrypted peptides that were encoded in the human body that were previously unrecognized to play a role as antibiotics. This is dark matter that was previously unknown to have these antibiotic properties. This really opened up an entire new field of mining dark matter to try to find potentially useful molecules. Then we synthesized a lot of them and we validated them, and many of them actually had anti-infective properties in [inaudible 00:16:03] and infection as well.
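To make the idea of a cheap scoring function scanning a proteome more concrete, here is a minimal sketch in Python. It is not the lab's actual scoring function: `score`, the window lengths, and the threshold are all assumptions chosen for illustration. The sketch slides windows of several lengths along a protein sequence, scores each fragment, and keeps the ones above a threshold. It also shows why the peptide count grows so quickly, since every protein contributes many overlapping fragments.

```python
def count_windows(protein_length: int, min_len: int, max_len: int) -> int:
    """Number of contiguous fragments with length in [min_len, max_len]."""
    return sum(max(0, protein_length - k + 1) for k in range(min_len, max_len + 1))


def score(fragment: str) -> float:
    """Hypothetical, deliberately simple score: net charge times hydrophobic fraction.

    An assumption for illustration only, not the published scoring function.
    """
    net_charge = sum(fragment.count(aa) for aa in "KR") - sum(fragment.count(aa) for aa in "DE")
    hydrophobic_fraction = sum(fragment.count(aa) for aa in "AILMFVW") / len(fragment)
    return net_charge * hydrophobic_fraction


def scan_protein(sequence: str, min_len: int, max_len: int, threshold: float):
    """Return (start, fragment, score) for every fragment scoring at or above threshold."""
    hits = []
    for k in range(min_len, max_len + 1):
        for start in range(len(sequence) - k + 1):
            fragment = sequence[start:start + k]
            s = score(fragment)
            if s >= threshold:
                hits.append((start, fragment, s))
    return hits


if __name__ == "__main__":
    toy_protein = "DDDDKKLLKKLLDDDD"  # made-up sequence, not a real protein
    print(count_windows(len(toy_protein), 8, 8))
    print(scan_protein(toy_protein, 8, 8, threshold=2.0))
```

Each fragment costs only a handful of character counts to score, which is the kind of property that lets a scan over tens of thousands of proteins finish quickly.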
Daniel Smith: I want to take a quick break to tell you about CITI Program’s Bioethics course, which provides an overview of bioethics concepts, principles and issues. For example, it addresses clinical ethics, genetics, justice in healthcare and more. You can learn more about this course at citiprogram.org. Now, back to the conversation with César.
From your perspective, what do you think these findings may mean for the future of antibiotic discovery and development?
César de la Fuente: If I look back, I think a lot of this AI work and algorithmic work, what it has enabled, and I can talk about subsequent projects that we followed up on, it has enabled us to accelerate discovery in antibiotics. If we think about traditional methods, it takes between three and six years to discover clinical candidates that might be exciting to take to phase one, and phase two, and phase three clinical trials. Now with AI, we can do that in a matter of hours, and we can actually discover hundreds of thousands of preclinical candidates. We basically compressed the time that it takes to discover potential antimicrobials from years to a matter of hours.
I think that’s really one of the takeaways that, if I reflect back on the work that we’ve been doing, it’s one of the most surprising things. If you had asked maybe five, six years ago, I would have probably said that’s close to impossible. But with advances in compute and our ability to develop better and better algorithms to mine entire genomes and entire proteomes, really mining biology as a source of molecules and potential medicines, this has been really transformative.
Daniel Smith: Absolutely. Just to shift gears for a moment, when it comes to these ongoing efforts in antibiotic discovery, what are some of the unique ethical and legal issues that you and your colleagues are considering as you do this work?
César de la Fuente: Yeah, that’s a great question and I’ll tell you a little bit about another project that really ties in well with bioethical and other potential ramifications. It’s a follow on work from the exploration of the human proteome. When we explored the human proteome as a source of antibiotics, we found, like I mentioned, a lot of these encrypted sequences that were previously unknown to have antibiotic properties and they were hidden messages in our own bodies that we didn’t know about before.
That led us to hypothesize that most likely, we would find similar sequences or similar molecules across the tree of life, not only included in the human body. And throughout evolution, that maybe we would find similar things. Since we had done the mining of the human proteome, we decided to next look at our closest relatives, which are Neanderthals and Denisovans, to see if we could find similar molecules included in their genomes and their proteomes. In order to tackle that, we had to develop a machine learning model. It was a machine learning model that, in broad terms, was able to discover antibiotics, antibiotic molecules, included in the Denisovans and the Neanderthals. This was quite exciting. It was the first description of any sort of therapeutic molecule discovered in extinct organisms. We called this new field molecular de-extinction, so the notion of bringing back molecules from the past to address present day problems, which is [inaudible 00:19:33] from modern day systems.
One of the motivations was that perhaps molecules that existed hundreds of thousands of years ago could be brought back, because one of the rationales is that bacteria today, bacterial pathogens today, would have never encountered those molecules so maybe they’d give us a better chance of fighting off contemporary pathogens. We did that and it was truly an exciting moment when it was the time to actually resurrect those molecules using chemistry. Because as far as we know, some of them at least, they’re not expressed in living organisms, so they’re truly molecules from the past as far as we know. We really resurrected them using chemical methods. We brought them to life, if you will. Then we tested them on these bacteria and some of them were active against contemporary pathogens, not only in vitro but also in particular mouse models of infection.
This was really exciting and it convinced us that we could sample extinct organisms as a source of antimicrobials, as a source of potentially useful molecules. What we did next is that we just challenged ourselves and we said, “Why don’t we just sample every extinct organism known to science?” Instead of just Denisovans, and Neanderthals and modern humans. We decided to mine what we called the extinctome, which is essentially every extinct organism known in literature. In order to tackle this, we had to develop a more powerful machine learning model, so we developed a deep learning model that enabled us to mine hundreds of proteomes at a time. This work was extremely exciting. It really allowed us to discover new antibiotics in creatures from the past, including the woolly mammoth, the ancient elk and some ancient big wings, and even a giant sloth. That has a very interesting history. The remnants of this giant sloth were actually initially discovered by Charles Darwin in one of his expeditions to Patagonia.
We were able to find a whole new world of antimicrobial molecules that were really previously just hidden in plain sight in all these extinct organisms. I’m talking about organisms going from the Holocene to the Pleistocene. Basically, we brought those molecules back to life and many of them have anti-infective activity, particularly in mouse models. We now have a bunch of clinical candidates coming from all that work.
Going back to the initial question of bioethics, from a bioethical standpoint, one of the things that worried me a little bit throughout this molecular de-extinction work was what does it mean to bring back a molecule that used to exist hundreds of thousands of years ago, back to the present world? Is that okay to do that? From early on, we’ve been consulting with bioethicists to make sure that whatever we do in the lab, we do it responsibly. We truly believe that’s one of the ethos of the lab, about responsible innovation.
One of the things that we’ve been doing, for example, is to make sure that whatever molecule we synthesize chemically cannot self-replicate. If it were to escape from the lab, it would not be able to self-replicate in any way. We’ve also been obviously storing every molecule that we synthesize very securely and safely. That’s something, from a bioethical perspective, that we take very seriously. We’ve been talking to people that know a lot more about bioethics than us, that can really advise us as to how to do the science that we do in the best way possible.
The other thing that maybe I’ll mention, which I thought was pretty funny, is that when we were initially discovering all these new sequences, new molecules from extinct organisms, I went to the patent office here at the University of Pennsylvania. I had this funny conversation. I told them, “I know that natural sequences are not patentable,” because through the Myriad case, natural sequences were deemed not to be patentable because they were produced as products of evolution. But I asked them, “What about extinct sequences or extinct molecules that existed tens of thousands of years ago but are no longer present in the living world, in the natural world that we live in?” They looked at me like I was crazy and they didn’t know what I was talking about. I explained a little bit more about the project. Then basically, they didn’t know.
This is creating a new area of patent law, where our patent lawyers are trying to figure out whether molecules from the past might be patentable or not. I’ve actually written an opinion piece with a patent lawyer, an expert in the field, to try to provide a balanced view as to the pros and cons of patenting these sorts of molecules from extinct organisms.
Daniel Smith: Given the novelty of this work that you’re doing, which again is very fascinating, and also some of the issues that it’s raising like you were just talking about with the patentability, do you have any additional resources that you suggest where people can learn more? Or perhaps, even just advice that you have for folks that might be looking to also get into this type of work?
César de la Fuente: Yeah. These are all really emerging areas. I would say five, six years ago, AI for antibiotic discovery was not even a field. Molecular de-extinction is something tangible as of a year ago. But people can refer to our website. I also have LinkedIn and X accounts, and we try to report some of our latest findings there. I would say I really encourage young minds. If they’re interested in a problem of global magnitude that affects every corner of the world, I think antimicrobial resistance is such a problem. It’s one of the most underinvested areas that you can find in our society that actually attacks and kills the most people. It does not discriminate, it affects people everywhere in the world that you care about, that you think about. It’s predicted to get much worse.
Then, I would just like to highlight the importance of antibiotics. Antibiotics are not only helpful to treat an infection when we get infected and they save our lives, but they’re also critical for modern medicine as we know it. With things like even childbirth or chemotherapy treatments, where patients for example are immunocompromised, oftentimes they can die as a consequence of infections. You need effective antibiotics to treat those patients. In surgeries, you need effective antibiotics. In every critical and routine procedure in modern medicine, you need to have effective antimicrobials. They are really precious medicines, so we need to value them tremendously. Without good antibiotics, we face an uncertain future where we might go back to a pre-antibiotic era where people could just die from a simple scratch that gets infected. Also, I think it’s important to underscore that, over the last 100 years or so, humans, we’ve been able to almost double our lifespan and this is really thanks to three main pillars. Antibiotics, vaccines, and clean water. Imagine removing one of those pillars. We would be facing a really uncertain future where infections could really have a dramatic effect in our society.
Daniel Smith: Absolutely. Going back to the resources that you mentioned, I’ll certainly include links to those in our show notes so our listeners can learn more, and follow you, and connect.
I guess my final question for you then today is you mentioned earlier about how we’re increasing our abilities with computer power and also algorithmic design, and also these new fields that you’ve helped forge are advancing, so where do you see this field headed, given those things?
César de la Fuente: I think what I’m most excited about, and I think going back to the young minds, whoever is listening, we really need people, young talent, driven talent who want to change the world to actually join some of these efforts.
But I think what I’m dreaming about at the moment is that we will be able to use AI to really discover new antibiotics that can enter the clinic right away. In order to do that, we need to develop algorithms that not only optimize for antimicrobial activity, so to kill bacteria more effectively, but also optimize for all the other parameters that make a drug. This includes having molecules that are non-toxic against human cells, that are stable enough to be able to exert their antimicrobial function before they get degraded. We also need molecules that can distribute throughout the body and reach the infection site, and in fact be able to clear the infection. We need to reduce off-target effects as much as possible.
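One common way to reason about several drug properties at once is Pareto filtering: keep only the candidates that no other candidate beats on every property simultaneously. The sketch below is an illustration of that general technique, not a description of the lab's algorithms. The `Candidate` fields, the names, and all the numbers are hypothetical, and in practice each value would come from a predictive model or an experiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    activity: float   # predicted antimicrobial activity, higher is better
    toxicity: float   # predicted toxicity to human cells, lower is better
    stability: float  # predicted stability before degradation, higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on every property and strictly better on one."""
    no_worse = (
        a.activity >= b.activity
        and a.toxicity <= b.toxicity
        and a.stability >= b.stability
    )
    strictly_better = (
        a.activity > b.activity
        or a.toxicity < b.toxicity
        or a.stability > b.stability
    )
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep the candidates that are not dominated by any other candidate."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


if __name__ == "__main__":
    # Entirely made-up values for illustration.
    pool = [
        Candidate("pep-A", activity=0.9, toxicity=0.8, stability=0.5),
        Candidate("pep-B", activity=0.7, toxicity=0.2, stability=0.6),
        Candidate("pep-C", activity=0.6, toxicity=0.3, stability=0.5),
    ]
    for c in pareto_front(pool):
        print(c.name)
```

In this toy pool, the third candidate is dropped because the second is at least as good on all three properties. The first survives despite its high toxicity because nothing else matches its activity, which is exactly the kind of trade-off a researcher would then have to weigh.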
We are developing such parts of the toolkit, different algorithms that can optimize for each of those different properties that you actually need to be able to build a drug. But in the future, perhaps we’ll have an Amazon type system for personalized medicine. Let’s say if a patient has an infection caused by a specific bacterial pathogen, maybe there is a rapid diagnostic that tells us, “This is the pathogen that needs to be targeted.” Perhaps then an AI coupled with automation can quickly create an antibiotic specifically for that patient, specifically for that infection that is causing problems in that patient. Maybe that’s an imaginary future right now, but maybe we see that someday.
Daniel Smith: That is a great place to leave our conversation for today. Thank you again, César.
César de la Fuente: It’s been a pleasure. Thank you, Daniel.
Daniel Smith: I also invite everyone to visit citiprogram.org to learn more about our courses and webinars on research, ethics and compliance. You may be interested in our Technology, Ethics, and Regulations course, which covers various technologies and their associated ethical issues and governance approaches. With that, I look forward to bringing you all more conversations on all things tech ethics.
How to Listen and Subscribe to the Podcast
You can find On Tech Ethics with CITI Program available from several of the most popular podcast services. Subscribe on your favorite platform to receive updates when episodes are newly released. You can also subscribe to this podcast by pasting “https://feeds.buzzsprout.com/2120643.rss” into your podcast app.
Recent Episodes
- Season 1 – Episode 18: Use of RealResponse in Fostering Safe and Inclusive Work Environments
- Season 1 – Episode 17: Ethical Considerations for Artificial Placenta and Womb Technologies
- Season 1 – Episode 16: Privacy and Ethical Considerations for Extended Reality Settings
- Season 1 – Episode 15: Considerations for Using AI in IRB Operations
Meet the Guest
Cesar de la Fuente, PhD – University of Pennsylvania
Cesar de la Fuente has pioneered computational approaches that have accelerated antibiotic discovery, yielding numerous preclinical candidates. He has received over 70 awards, published over 140 papers, and is an elected Fellow of the American Institute for Medical and Biological Engineering, becoming one of the youngest ever to be inducted.
Meet the Host
Daniel Smith, Associate Director of Content and Education and Host of On Tech Ethics Podcast – CITI Program
As Associate Director of Content and Education at CITI Program, Daniel focuses on developing educational content in areas such as the responsible use of technologies, humane care and use of animals, and environmental health and safety. He received a BA in journalism and technical communication from Colorado State University.