In a recent CNN feature, César de la Fuente, Presidential Assistant Professor in Bioengineering, Psychiatry, Microbiology, and in Chemical and Biomolecular Engineering, commented on a study about a new type of antibiotic that was discovered with artificial intelligence:
“I think AI, as we’ve seen, can be applied successfully in many domains, and I think drug discovery is sort of the next frontier.”
The de la Fuente lab uses machine learning and biology to help prevent, detect, and treat infectious diseases, and is pioneering the research and discovery of new antibiotics.
Machine learning (ML) programs computers to learn the way we do – through the continual assessment of data and identification of patterns based on past outcomes. ML can quickly pick out trends in big datasets, operate with little to no human interaction and improve its predictions over time. Due to these abilities, it is rapidly finding its way into medical research.
People with breast cancer may soon be diagnosed through ML faster than through a biopsy. Those suffering from depression might be able to predict mood changes through smartphone recordings of daily activities such as the time they wake up and the amount of time they spend exercising. ML may also help paralyzed people regain autonomy using prosthetics controlled by patterns identified in brain scan data. ML research promises these and many other possibilities to help people lead healthier lives.
But while the number of ML studies grows, its actual use in doctors’ offices has not expanded much past simple functions such as converting voice to text for notetaking.
The limitations lie in medical research’s small sample sizes and unique datasets. Such limited data make it hard for machines to identify meaningful patterns. The more data, the more accurate ML diagnoses and predictions become. For many diagnostic uses, thousands of subjects would be needed, but most studies include only dozens.
But there are ways to find significant results from small datasets if you know how to manipulate the numbers. Running statistical tests over and over again with different subsets of your data can indicate significance in a dataset that in reality may be just random outliers.
This tactic, known as P-hacking or feature hacking in ML, leads to the creation of predictive models that are too limited to be useful in the real world. What looks good on paper doesn’t translate to a doctor’s ability to diagnose or treat us.
These statistical mistakes, oftentimes done unknowingly, can lead to dangerous conclusions.
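The effect described above can be reproduced with pure noise. The following is a minimal sketch, not taken from Kording's program: it assumes a hypothetical study of 40 subjects, split arbitrarily into two groups, with 200 candidate measurements that are all random numbers. The function names and the normal approximation used for the p-value are illustrative assumptions.

```python
import math
import random
import statistics


def two_sided_p(group_a, group_b):
    """Approximate two-sided p-value for a difference in means.

    Uses a normal approximation to keep the sketch dependency-free;
    a real analysis would use a proper t-test.
    """
    se = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    z = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_feature_hacking(n_subjects=40, n_features=200, seed=0):
    """Test many pure-noise features against the same arbitrary labels."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_subjects)]  # hypothetical patient/control split
    p_values = []
    for _ in range(n_features):
        # Every "measurement" is random noise, so no real effect exists.
        values = [rng.gauss(0, 1) for _ in range(n_subjects)]
        patients = [v for v, lab in zip(values, labels) if lab == 1]
        controls = [v for v, lab in zip(values, labels) if lab == 0]
        p_values.append(two_sided_p(patients, controls))
    return p_values


p_values = simulate_feature_hacking()
print("p-value of the one pre-planned test:", round(p_values[0], 3))
print("smallest p-value after trying every feature:", round(min(p_values), 4))
print("features 'significant' at 0.05:", sum(p < 0.05 for p in p_values))
```

Because roughly one test in twenty falls below 0.05 by chance alone, reporting only the best of many attempts makes noise look like a finding.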
To help scientists avoid these mistakes and push ML applications forward, Konrad Kording, Nathan Francis Mossell University Professor with appointments in the Departments of Bioengineering and Computer and Information Science in Penn Engineering and the Department of Neuroscience at Penn’s Perelman School of Medicine, is leading an aspect of a large, NIH-funded program known as CENTER – Creating an Educational Nexus for Training in Experimental Rigor. Kording will lead Penn’s cohort by creating the Community for Rigor, which will provide open-access resources on conducting sound science. Members of this inclusive scientific community will be able to engage with ML simulations and discussion-based courses.
“The reason for the lack of ML in real-world scenarios is due to statistical misuse rather than the limitations of the tool itself,” says Kording. “If a study publishes a claim that seems too good to be true, it usually is, and many times we can track that back to their use of statistics.”
Such studies that make their way into peer-reviewed journals contribute to misinformation and mistrust in science and are more common than one might expect.
Yi-An Hsieh, a fourth-year Bioengineering student from Anaheim, California, worked remotely this summer on a team that spanned three labs, including the Kamoun Lab at the Hospital of the University of Pennsylvania. Hsieh credits her research on kidney graft failure with enriching her scientific skill set, exposing her to machine learning and real-time interaction with genetic datasets. In a guest post for the Career Services Blog, Hsieh writes about her remote summer internship experience. “It showed me that this type of research energy that could not be dampened despite the distance,” she writes.
Developing new soft materials requires new data-driven research techniques, such as autonomous experimentation. Data regarding nanometer-scale material structure, taken by X-ray measurements at a synchrotron, can be fed into an algorithm that identifies the most relevant features, represented here as red dots. The algorithm then determines the optimum conditions for the next set of measurements and directs their execution without human intervention. Brookhaven National Laboratory’s Kevin Yager, who helped develop this technique, will co-teach a course on it as part of a new Penn project on Data Driven Soft Materials Research.
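The measure-analyze-decide loop described above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the technique developed at Brookhaven: `measure` is a hypothetical stand-in for an instrument reading with a sharp transition near 0.7, and the selection rule (sample the midpoint of the gap where the signal changes most) is a simple heuristic chosen for illustration.

```python
import math


def measure(x):
    """Hypothetical instrument reading with a sharp transition near x = 0.7."""
    return math.tanh(40 * (x - 0.7))


def interval_score(left, right):
    """Score a gap between two measured points: signal change times gap width."""
    (x0, y0), (x1, y1) = left, right
    return abs(y1 - y0) * (x1 - x0)


def next_condition(samples):
    """Choose the next setting: the midpoint of the highest-scoring gap."""
    pts = sorted(samples)
    left, right = max(zip(pts, pts[1:]), key=lambda pair: interval_score(*pair))
    return (left[0] + right[0]) / 2


def run_autonomous_loop(n_steps=12):
    """Alternate between measuring and deciding, with no human in the loop."""
    samples = [(0.0, measure(0.0)), (1.0, measure(1.0))]
    for _ in range(n_steps):
        x = next_condition(samples)
        samples.append((x, measure(x)))
    return sorted(samples)


samples = run_autonomous_loop()
near_feature = [x for x, _ in samples if abs(x - 0.7) < 0.1]
print(f"{len(near_feature)} of {len(samples)} measurements fall near the transition")
```

Starting from only the two endpoints, the loop spends most of its measurements where the signal changes quickly instead of sampling the whole range evenly.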
The National Science Foundation’s Research Traineeship Program aims to support graduate students, educate the STEM leaders of tomorrow and strengthen the national research infrastructure. The program’s latest series of grants is going toward university programs focused on artificial intelligence and quantum information science and engineering – two areas of high priority in academia, industry and government.
Chinedum Osuji, Eduardo D. Glandt Presidential Professor and Chair of the Department of Chemical and Biomolecular Engineering (CBE), has received one of these grants to apply data science and machine learning to the field of soft materials. The grant will provide five years of support and a total of $3 million for a new Penn project on Data Driven Soft Materials Research.
Osuji will work with co-PIs Russell Composto, Professor and Howell Family Faculty Fellow in Materials Science and Engineering, Bioengineering, and in CBE, Zahra Fakhraai, Associate Professor of Chemistry in Penn’s School of Arts & Sciences (SAS) with a secondary appointment in CBE, Paris Perdikaris, Assistant Professor in Mechanical Engineering and Applied Mechanics, and Andrea Liu, Hepburn Professor of Physics and Astronomy in SAS, all of whom will help run the program and provide the connections between the multiple fields of study where its students will train.
These and other affiliated faculty members will work closely with co-PI Kristin Field, who will serve as Program Coordinator and Director of Education.
César de la Fuente, Presidential Assistant Professor in Psychiatry, Bioengineering, Microbiology, and in Chemical and Biomolecular Engineering, has been honored with a 2022 Young Investigator Award by the Royal Spanish Society of Chemistry (RSEQ) for his pioneering research efforts to combine the power of machines and biology to help prevent, detect, and treat infectious diseases.
Five University of Pennsylvania undergraduates have received 2022 Goldwater Scholarships, including Laila Barakat Norford, a third-year Bioengineering major from Wayne, Pennsylvania. Goldwater Scholarships are awarded to sophomores or juniors planning research careers in mathematics, the natural sciences, or engineering.
Penn has produced 23 Goldwater Scholars in the past seven years and a total of 55 since Congress established the scholarship in 1986.
Laila Barakat Norford is majoring in bioengineering with minors in computer science and bioethics in Penn Engineering. As a Rachleff Scholar, Norford has been engaged in systems biology research since her first year. Her current research uses machine learning to predict cell types in intestinal organoids from live-cell images, enabling the mechanisms of development and disease to be characterized in detail. At Penn, she is an Orientation Peer Advisor, a volunteer with Advancing Women in Engineering and the Penn Society of Women Engineers, and a teaching assistant for introductory computer science. She is secretary of the Penn Band, plays the clarinet, and is a member of the Band’s Fanfare Honor Society for service and leadership. Norford registers voters with Penn Leads the Vote and canvasses for state government candidates. She is also involved in Penn’s LGBTQ+ community as a member of PennAces. Norford plans to pursue a Ph.D. in computational biology, aspiring to build computational tools to address understudied diseases and health disparities.
Konrad Kording, Nathan Francis Mossell University Professor in Bioengineering, Neuroscience, and Computer and Information Science, was appointed the Co-Director of the CIFAR Program in Learning in Machines & Brains. The appointment will start April 1, 2022.
CIFAR is a global research organization that convenes extraordinary minds to address the most important questions facing science and humanity. CIFAR was founded in 1982 and now includes over 400 interdisciplinary fellows and scholars, representing over 130 institutions and 22 countries. CIFAR supports research at all levels of development in areas ranging from Artificial Intelligence and child and brain development, to astrophysics and quantum computing. The program in Learning in Machines & Brains brings together international scientists to examine “how artificial neural networks could be inspired by the human brain, and developing the powerful technique of deep learning.” Scientists, industry experts, and policymakers in the program are working to understand the computational and mathematical principles behind learning, whether in brains or in machines, in order to understand human intelligence and improve the engineering of machine learning. As Co-Director, Kording will oversee the collective intellectual development of the LMB program, which includes over 30 Fellows, Advisors, and Global Scholars. The program is also co-directed by Yoshua Bengio, the Canada CIFAR AI Chair and Professor in Computer Science and Operations Research at Université de Montréal.
Kording, a Penn Integrates Knowledge (PIK) Professor, was previously named an associate fellow of CIFAR in 2017. Kording’s groundbreaking interdisciplinary research uses data science to advance a broad range of topics that include understanding brain function, improving personalized medicine, collaborating with clinicians to diagnose diseases based on mobile phone data and even understanding the careers of professors. Across many areas of biomedical research, his group analyzes large datasets to test new models and thus get closer to an understanding of complex problems in bioengineering, neuroscience and beyond.
Visit Kording’s lab website and CIFAR profile page to learn more about his work in neuroscience, data science, and deep learning.
The Very Large Scale Microfluidic Integration (VLSMI) platform, a technology developed by the Penn researchers, contains hundreds of mixing channels for mass-producing mRNA-carrying lipid nanoparticles.
Penn Engineering secured a multi-million-dollar contract with Wellcome Leap under the organization’s $60 million RNA Readiness + Response (R3) program, which is jointly funded with the Coalition for Epidemic Preparedness Innovations (CEPI). Penn Engineers aim to create “on-demand” manufacturing technology that can produce a range of RNA-based vaccines.
The Penn Engineering team features Daeyeon Lee, Evan C Thompson Term Chair for Excellence in Teaching and Professor in Chemical and Biomolecular Engineering, Michael Mitchell, Skirkanich Assistant Professor of Innovation in Bioengineering, David Issadore, Associate Professor in Bioengineering and Electrical and Systems Engineering, and Sagar Yadavali, a former postdoctoral researcher in the Issadore and Lee labs and now the CEO of InfiniFluidics, a spinoff company based on their research. Drew Weissman of the Perelman School of Medicine, whose foundational research directly contributed to the development of mRNA-based COVID-19 vaccines, is also a part of this interdisciplinary team.
The success of these COVID-19 vaccines has inspired a fresh perspective and wave of research funding for RNA therapeutics across a wide range of difficult diseases and health issues. These therapeutics now need to be equitably and efficiently distributed, something currently limited by inefficient mRNA vaccine manufacturing processes. More efficient processes would allow technologies to move rapidly from the lab to the clinic.
As part of a major University-wide investment in science, engineering, and medicine, the Innovation in Data Engineering and Science Initiative aims to help Penn become a leader in developing data-driven approaches that can transform scientific discovery, engineering research, and technological innovation.
From smartphones and fitness trackers to social media posts and COVID-19 cases, the past few years have seen an explosion in the amount and types of data that are generated daily. To help make sense of these large, complex datasets, the field of data science has grown, providing methodologies, tools, and perspectives across a wide range of academic disciplines.
As part of its $750 million investment in science, engineering, and medicine, the University has committed to supporting the future needs of this field. To this end, the Innovation in Data Engineering and Science (IDEAS) initiative will help Penn become a leader in developing data-driven approaches that can transform scientific discovery, engineering research, and technological innovation.
“The IDEAS initiative is game-changing for our University,” says President Amy Gutmann. “This new investment allows us to boost our interdisciplinary efforts across campus, recruit phenomenal additional team members, and generate an even more sound foundation for discovery, experimentation, and design. This initiative is a clear statement that Penn is committed to taking data science head-on.”
“One of the unique things about data science and data engineering is that it’s a very horizontal technology, one that is going to be impacting every department on campus,” says George Pappas, Electrical and Systems Engineering Department chair. “When you have a horizontal technology in a competitive area, we have to figure out specific areas where Penn can become a worldwide leader.”
To do this, IDEAS aims to recruit new faculty across three research areas: artificial intelligence (AI) to transform scientific discovery, trustworthy AI for autonomous systems, and understanding connections between the human brain and AI.
In the area of neuroscience and how the human brain is similar to AI and machine learning approaches, research from PIK Professor Konrad Kording and Dani Bassett’s Complex Systems lab exemplifies the types of cross-disciplinary efforts that are essential for addressing complex questions. By recruiting additional faculty in this area, IDEAS will help Penn make strides in bio-inspired computing and in future life-changing discoveries that could address cognitive disorders and nervous system diseases.
Dani S. Bassett, J. Peter Skirkanich Professor in Bioengineering and in Electrical and Systems Engineering
Bassett runs the Complex Systems lab which tackles problems at the intersection of science, engineering, and medicine using systems-level approaches, exploring fields such as curiosity, dynamic networks in neuroscience, and psychiatric disease. They are a pioneer in the emerging field of network science which combines mathematics, physics, biology and systems engineering to better understand how the overall shape of connections between individual neurons influences cognitive traits.
Jason Burdick, Ph.D.
Jason A. Burdick, Robert D. Bent Professor in Bioengineering
Burdick runs the Polymeric Biomaterials Laboratory which develops polymer networks for fundamental and applied studies with biomedical applications with a specific emphasis on tissue regeneration and drug delivery. The specific targets of his research include: scaffolding for cartilage regeneration, controlling stem cell differentiation through material signals, electrospinning and 3D printing for scaffold fabrication, and injectable hydrogels for therapies after a heart attack.
César de la Fuente, Ph.D.
César de la Fuente, Presidential Assistant Professor in Bioengineering and Chemical and Biomolecular Engineering in Penn Engineering and in Microbiology and Psychiatry in the Perelman School of Medicine
De la Fuente runs the Machine Biology Group which combines the power of machines and biology to prevent, detect, and treat infectious diseases. He pioneered the development of the first antibiotic designed by a computer with efficacy in animals, designed algorithms for antibiotic discovery, and invented rapid low-cost diagnostics for COVID-19 and other infections.
Carl June, M.D.
Carl H. June, Richard W. Vague Professor in Immunotherapy in the Perelman School of Medicine and member of the Bioengineering Graduate Group
June is the Director of the Center for Cellular Immunotherapies and the Parker Institute for Cancer Immunotherapy and runs the June Lab, which develops new forms of T cell-based therapies. June’s pioneering research in gene therapy led to the FDA approval of CAR T therapy for treating acute lymphoblastic leukemia (ALL), one of the most common childhood cancers.
Vivek Shenoy, Ph.D.
Vivek Shenoy, Eduardo D. Glandt President’s Distinguished Professor in Bioengineering, Mechanical Engineering and Applied Mechanics (MEAM), and in Materials Science and Engineering (MSE)
Shenoy runs the Theoretical Mechanobiology and Materials Lab which develops theoretical concepts and numerical principles for understanding engineering and biological systems. His analytical methods and multiscale modeling techniques gain insight into a myriad of problems in materials science and biomechanics.
The highly anticipated annual list identifies researchers who demonstrated significant influence in their chosen field or fields through the publication of multiple highly cited papers during the last decade. Their names are drawn from the publications that rank in the top 1% by citations for field and publication year in the Web of Science™ citation index.
Bassett and Burdick were both on the Highly Cited Researchers list in 2019 and 2020.
The methodology that determines the “who’s who” of influential researchers draws on the data and analysis performed by bibliometric experts and data scientists at the Institute for Scientific Information™ at Clarivate. It also uses the tallies to identify the countries and research institutions where these scientific elite are based.
David Pendlebury, Senior Citation Analyst at the Institute for Scientific Information at Clarivate, said: “In the race for knowledge, it is human capital that is fundamental and this list identifies and celebrates exceptional individual researchers who are having a great impact on the research community as measured by the rate at which their work is being cited by others.”
The full 2021 Highly Cited Researchers list and executive summary can be found online.