Kevin B. Johnson, M.D., M.S., was featured in Cincinnati Children’s Hospital’s “Envisioning Our Future for Children” speaker series, discussing “the evolution of the EHR and its future directions.” An electronic health record, or EHR, is a digital record of a patient’s chart, recording health information and data, coordinating orders, tracking results, and providing patient support. Johnson “predicts a new wave of transformation in digital health technologies that could make rapid progress” in several areas of medicine, including reducing cost and improving patient outcomes. Johnson is Vice President for Applied Informatics at the University of Pennsylvania Health System and the David L. Cohen University Professor with appointments in Biostatistics, Epidemiology and Informatics and Computer and Information Science and secondary appointments in the Annenberg School for Communication, Pediatrics, and Bioengineering.
Konrad Kording, Nathan Francis Mossell University Professor in Bioengineering, Neuroscience, and Computer and Information Sciences, was appointed the Co-Director of the CIFAR Program in Learning in Machines & Brains. The appointment will start April 1, 2022.
CIFAR is a global research organization that convenes extraordinary minds to address the most important questions facing science and humanity. CIFAR was founded in 1982 and now includes over 400 interdisciplinary fellows and scholars, representing over 130 institutions and 22 countries. CIFAR supports research at all levels of development in areas ranging from Artificial Intelligence and child and brain development, to astrophysics and quantum computing. The program in Learning in Machines & Brains brings together international scientists to examine “how artificial neural networks could be inspired by the human brain, and developing the powerful technique of deep learning.” Scientists, industry experts, and policymakers in the program are working to understand the computational and mathematical principles behind learning, whether in brains or in machines, in order to understand human intelligence and improve the engineering of machine learning. As Co-Director, Kording will oversee the collective intellectual development of the LMB program, which includes over 30 Fellows, Advisors, and Global Scholars. The program is also co-directed by Yoshua Bengio, the Canada CIFAR AI Chair and Professor in Computer Science and Operations Research at Université de Montréal.
Kording, a Penn Integrates Knowledge (PIK) Professor, was previously named an associate fellow of CIFAR in 2017. Kording’s groundbreaking interdisciplinary research uses data science to advance a broad range of topics that include understanding brain function, improving personalized medicine, collaborating with clinicians to diagnose diseases based on mobile phone data and even understanding the careers of professors. Across many areas of biomedical research, his group analyzes large datasets to test new models and thus get closer to an understanding of complex problems in bioengineering, neuroscience and beyond.
More data is being produced across diverse fields within science, engineering, and medicine than ever before, and our ability to collect, store, and manipulate it grows by the day. With scientists of all stripes reaping the raw materials of the digital age, there is an increasing focus on developing better strategies and techniques for refining this data into knowledge, and that knowledge into action.
Enter data science, where researchers try to sift through and combine this information to understand relevant phenomena, build or augment models, and make predictions.
One powerful technique in data science’s armamentarium is machine learning, a type of artificial intelligence that enables computers to automatically generate insights from data without being explicitly programmed as to which correlations they should attempt to draw.
Advances in computational power, storage, and sharing have enabled machine learning to be more easily and widely applied, but new tools for collecting reams of data from massive, messy, and complex systems—from electron microscopes to smart watches—are what have allowed it to turn entire fields on their heads.
“This is where data science comes in,” says Susan Davidson, Weiss Professor in Computer and Information Science (CIS) at Penn’s School of Engineering and Applied Science. “In contrast to fields where we have well-defined models, like in physics, where we have Newton’s laws and the theory of relativity, the goal of data science is to make predictions where we don’t have good models: a data-first approach using machine learning rather than using simulation.”
Penn Engineering’s formal data science efforts include the establishment of the Warren Center for Network & Data Sciences, which brings together researchers from across Penn with the goal of fostering research and innovation in interconnected social, economic and technological systems. Other research communities, including Penn Research in Machine Learning and the student-run Penn Data Science Group, bridge the gap between schools, as well as between industry and academia. Programmatic opportunities for Penn students include a Data Science minor for undergraduates, and a Master of Science in Engineering in Data Science, which is directed by Davidson and jointly administered by CIS and Electrical and Systems Engineering.
Penn academic programs and researchers on the leading edge of the data science field will soon have a new place to call home: Amy Gutmann Hall. The 116,000-square-foot, six-floor building, located on the northeast corner of 34th and Chestnut Streets near Lauder College House, will centralize resources for researchers and scholars across Penn’s 12 schools and numerous academic centers while making the tools of data analysis more accessible to the entire Penn community.
Faculty from all six departments in Penn Engineering are at the forefront of developing innovative data science solutions, primarily relying on machine learning, to tackle a wide range of challenges. Researchers show how they use data science in their work to answer fundamental questions in topics as diverse as genetics, “information pollution,” medical imaging, nanoscale microscopy, materials design, and the spread of infectious diseases.
Bioengineering: Unraveling the 3D genomic code
Scattered throughout the genomes of healthy people are tens of thousands of repetitive DNA sequences called short tandem repeats (STRs). But the unstable expansion of these repetitions is at the root of dozens of inherited disorders, including Fragile X syndrome, Huntington’s disease, and ALS. Why these STRs are susceptible to this disease-causing expansion, whereas most remain relatively stable, remains a major conundrum.
Complicating this effort is the fact that disease-associated STR tracts exhibit tremendous diversity in sequence, length, and localization in the genome. Moreover, that localization has a three-dimensional element because of how the genome is folded within the nucleus. Mammalian genomes are organized into a hierarchy of structures called topologically associated domains (TADs). Each one spans millions of nucleotides and contains smaller subTADs, which are separated by linker regions called boundaries.
“The genetic code is made up of three billion base pairs. Stretched out end to end, it is 6 feet 5 inches long, and must be subsequently folded into a nucleus that is roughly the size of a head of a pin,” says Jennifer Phillips-Cremins, associate professor and dean’s faculty fellow in Bioengineering. “Genome folding is an exciting problem for engineers to study because it is a problem of big data. We not only need to look for patterns along the axis of three billion base pairs of letters, but also along the axis of how the letters are folded into higher-order structures.”
To address this challenge, Phillips-Cremins and her team, in collaboration with the lab of Dani Bassett, J. Peter Skirkanich Professor in Bioengineering, recently developed a new mathematical approach called 3DNetMod to accurately detect these chromatin domains in 3D maps of the genome.
“In our group, we use an integrated, interdisciplinary approach relying on cutting-edge computational and molecular technologies to uncover biologically meaningful patterns in large data sets,” Phillips-Cremins says. “Our approach has enabled us to find patterns in data that classic biology training might overlook.”
In a recent study, Phillips-Cremins and her team used 3DNetMod to identify tens of thousands of subTADs in human brain tissue. They found that nearly all disease-associated STRs are located at boundaries demarcating 3D chromatin domains. Additional analyses of cells and brain tissue from patients with Fragile X syndrome revealed severe boundary disruption at a specific disease-associated STR.
“To our knowledge, these findings represent the first report of a possible link between STR instability and the mammalian genome’s 3D folding patterns,” Phillips-Cremins says. “The knowledge gained may shed new light into how genome structure governs function across development and during the onset and progression of disease. Ultimately, this information could be used to create molecular tools to engineer the 3D genome to control repeat instability.”
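As a rough illustration of the kind of question behind this finding, the sketch below asks what fraction of repeat loci lie within a chosen distance of a domain boundary. It is not 3DNetMod and does not reproduce the study's analysis: the function names, the toy coordinates, and the `tolerance` parameter are all assumptions made for the example.

```python
# Hypothetical sketch: what fraction of repeat loci fall near a domain boundary?
# This is NOT 3DNetMod; names, coordinates, and the tolerance are illustrative.
from bisect import bisect_left


def boundaries_from_domains(domains):
    """Collect the sorted, de-duplicated start and end coordinates of all domains."""
    return sorted({edge for start, end in domains for edge in (start, end)})


def distance_to_nearest(position, sorted_boundaries):
    """Distance from a position to the closest boundary coordinate."""
    i = bisect_left(sorted_boundaries, position)
    # Only the boundary just before and just after the position can be nearest.
    neighbours = sorted_boundaries[max(i - 1, 0):i + 1]
    return min(abs(position - b) for b in neighbours)


def fraction_near_boundary(str_positions, domains, tolerance):
    """Share of positions within `tolerance` of any domain edge."""
    if not str_positions or not domains:
        raise ValueError("need at least one position and one domain")
    boundaries = boundaries_from_domains(domains)
    near = [p for p in str_positions
            if distance_to_nearest(p, boundaries) <= tolerance]
    return len(near) / len(str_positions)


if __name__ == "__main__":
    toy_domains = [(0, 100), (100, 200)]   # made-up domain intervals
    toy_repeats = [95, 150, 200, 40]       # made-up repeat positions
    print(fraction_near_boundary(toy_repeats, toy_domains, tolerance=10))
```

In a real analysis, the domain calls would come from a dedicated detection method, and the observed fraction would need to be compared against a suitable background expectation before drawing any conclusion.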
From smartphones and fitness trackers to social media posts and COVID-19 cases, the past few years have seen an explosion in the amount and types of data that are generated daily. To help make sense of these large, complex datasets, the field of data science has grown, providing methodologies, tools, and perspectives across a wide range of academic disciplines.
As part of its $750 million investment in science, engineering, and medicine, the University has committed to supporting the future needs of this field. To this end, the Innovation in Data Engineering and Science (IDEAS) initiative will help Penn become a leader in developing data-driven approaches that can transform scientific discovery, engineering research, and technological innovation.
“The IDEAS initiative is game-changing for our University,” says President Amy Gutmann. “This new investment allows us to boost our interdisciplinary efforts across campus, recruit phenomenal additional team members, and generate an even more sound foundation for discovery, experimentation, and design. This initiative is a clear statement that Penn is committed to taking data science head-on.”
“One of the unique things about data science and data engineering is that it’s a very horizontal technology, one that is going to be impacting every department on campus,” says George Pappas, Electrical and Systems Engineering Department chair. “When you have a horizontal technology in a competitive area, we have to figure out specific areas where Penn can become a worldwide leader.”
To do this, IDEAS aims to recruit new faculty across three research areas: artificial intelligence (AI) to transform scientific discovery, trustworthy AI for autonomous systems, and understanding connections between the human brain and AI.
In the area of neuroscience and how the human brain is similar to AI and machine learning approaches, research from PIK Professor Konrad Kording and Dani Bassett’s Complex Systems lab exemplifies the types of cross-disciplinary efforts that are essential for addressing complex questions. By recruiting additional faculty in this area, IDEAS will help Penn make strides in bio-inspired computing and in future life-changing discoveries that could address cognitive disorders and nervous system diseases.
When Nathan Francis Mossell graduated in 1882, he became the first African American to earn a medical degree from Penn. He soon became a prominent African American physician, the first to be elected to the Philadelphia County Medical Society. He helped found the Frederick Douglass Memorial Hospital and Training School, which treated Black patients and helped train the next generation of Black doctors and nurses.
“Dr. Mossell was truly inspiring. He had to fight for everything, yet never reneged on his principles. He pretty much started a hospital and was a major champion for the advancement of equality for African Americans,” Kording said. “In my research, where I study how intelligence works, I am inspired by scholars like him who combine many different insights. He was a wonderful man, and I will be proud to carry his name.”
While biologists and chemists race to develop new antibiotics to combat constantly mutating bacteria, predicted to lead to 10 million deaths by 2050, engineers are approaching the problem through a different lens: finding naturally occurring antibiotics in the human genome.
The billions of base pairs in the genome are essentially one long string of code that contains the instructions for making all of the molecules the body needs. The most basic of these molecules are amino acids, the building blocks for peptides, which in turn combine to form proteins. However, there is still much to learn about how, and where, a particular set of instructions is encoded.
Now, bringing a computer science approach to a life science problem, an interdisciplinary team of Penn researchers has used a carefully designed algorithm to discover a new suite of antimicrobial peptides hiding deep within this code.
The study, published in Nature Biomedical Engineering, was led by César de la Fuente, Presidential Assistant Professor in Bioengineering, Microbiology, Psychiatry, and Chemical and Biomolecular Engineering, spanning both Penn Engineering and Penn Medicine, and his postdocs Marcelo Torres and Marcelo Melo. Collaborators Orlando Crescenzi and Eugenio Notomista of the University of Naples Federico II also contributed to this work.
“The human body is a treasure trove of information, a biological dataset. By using the right tools, we can mine for answers to some of the most challenging questions,” says de la Fuente. “We use the word ‘encrypted’ to describe the antimicrobial peptides we found because they are hidden within larger proteins that seem to have no connection to the immune system, the area where we expect to find this function.”
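To make the idea of peptides "hidden within larger proteins" concrete, here is a minimal sketch of a sliding-window scan over a protein sequence. The scoring rule, which counts net positive charge and the fraction of hydrophobic residues, is an assumption chosen for illustration, as are the thresholds, the function names, and the toy sequence. It is not the algorithm used in the study.

```python
# Hypothetical sketch: scan a protein for short fragments with a simple
# "antimicrobial-like" profile. The scoring rule and thresholds are
# illustrative assumptions, not the published algorithm.

POSITIVE = set("KR")            # lysine, arginine
NEGATIVE = set("DE")            # aspartate, glutamate
HYDROPHOBIC = set("AILMFVWY")   # a simple hydrophobic residue set (assumed)


def net_charge(fragment):
    """Positive minus negative residues (histidine ignored for simplicity)."""
    positives = sum(residue in POSITIVE for residue in fragment)
    negatives = sum(residue in NEGATIVE for residue in fragment)
    return positives - negatives


def hydrophobic_fraction(fragment):
    """Share of residues in the fragment that belong to HYDROPHOBIC."""
    return sum(residue in HYDROPHOBIC for residue in fragment) / len(fragment)


def find_candidate_fragments(protein, window=12, min_charge=3, min_hydrophobic=0.3):
    """Return (start, fragment) for every window that passes both thresholds."""
    hits = []
    for start in range(len(protein) - window + 1):
        fragment = protein[start:start + window]
        if (net_charge(fragment) >= min_charge
                and hydrophobic_fraction(fragment) >= min_hydrophobic):
            hits.append((start, fragment))
    return hits


if __name__ == "__main__":
    toy_protein = "MDEDSGSGSKRLKKLAKKIWKLLGSGSDEDE"  # made-up sequence
    for start, fragment in find_candidate_fragments(toy_protein):
        print(start, fragment)
```

A scan like this only nominates candidates; any fragment it flags would still need to be synthesized and tested experimentally.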
The murder of George Floyd, an unarmed Black man who was killed by a White police officer, affected the mental well-being of many Americans. The effects were multifaceted, as it was an act of police brutality and an example of systemic racism that occurred during the uncertainty of a global pandemic, creating an even more complex dynamic and emotional response.
Because poor mental health can lead to a myriad of additional ailments, including poor physical health, inability to hold a job and an overall decrease in quality of life, it is important to understand how certain events affect it. This is especially critical when the emotional burden of these events falls most on demographics affected by systemic racism. However, unlike physical health, mental health is challenging to characterize and measure, and thus, population-level data on mental health has been limited.
To better understand patterns of mental health on a population scale, Penn Engineers Lyle H. Ungar, Professor of Computer and Information Science (CIS), and Sharath Chandra Guntuku, Research Assistant Professor in CIS, take a computational approach to this challenge. Drawing on large-scale surveys as well as language analysis in social media through their work with the World Well-Being Project, they have developed visualizations of these patterns across the U.S.
Their latest study involves tracking changes in emotional and mental health following George Floyd’s murder. Combining polling data from the U.S. Census and Gallup, Guntuku, Ungar and colleagues have shown that Floyd’s murder sparked a wave of unprecedented sadness and anger across the U.S. population, the largest since relevant data began being recorded in 2009.
Last month, the second annual Women in Data Science (WiDS) @ Penn Conference virtually gathered nearly 500 registrants to participate in a week’s worth of academic and industry talks, live speaker Q&A sessions, and networking opportunities.
Following welcoming remarks from Erika James, Dean of the Wharton School, and Vijay Kumar, Nemirovsky Family Dean of Penn Engineering, the conference began with a keynote address from President of Microsoft US and Wharton alumna Kate Johnson.
Conference sessions continued throughout the week, featuring panels of academic data scientists from around Penn and beyond, industry leaders from IKEA Digital, Facebook and Poshmark, and lightning talks from student speakers who presented their data science research.
All of the conference’s sessions are now available on YouTube and the 2021 WiDS Conference Recap, including a talk titled “How Humans Build Models for the World” by Danielle Bassett, J. Peter Skirkanich Professor in Bioengineering and Electrical and Systems Engineering.
A Q&A with neuroscientist Konrad Kording on how connections between minds and machines are portrayed in popular culture, and what the future holds for this reality-defying technology.
For the many superheroes that use high-powered gadgets to save the day, there’s an equal number of villains who use technology nefariously. From robots that plug into human brains for fuel in “The Matrix” to the memory-warping devices seen in “Men in Black,” “Captain Marvel,” and “Total Recall,” technology that can control people’s minds is one of the most terrifying examples of technology gone wrong in science fiction and superhero films.
Now, progress made on brain-machine interfaces, technology that provides a direct communication link between a brain and an external device, is bringing us closer to a world that feels like science fiction. Elon Musk’s company Neuralink is working on a device to let people control computers with their minds, while Facebook’s “mind-reading initiative” can decode speech from brain activity. Is this progress a glimpse into a dark future, or are there more empowering ways in which brain-machine interfaces could become a force for good?
Q: What are the main challenges in connecting brains to devices?
The key problem is that you need to get a lot of information out of brains. Today’s prosthetic devices are very slow, and if we want to go faster it’s a tradeoff: I can go slower and then I am more precise, or I can go faster and be more noisy. We need to get more data out of brains, and we want to do it electrically, meaning we need to get more electrodes into brains.
So what do you need? You need a way of getting electrodes into the brain without making your brain into a pulp, you want the electrodes to be flexible so they can stay in longer, and then you want the system to be wireless. You don’t want to have a big connector on the top of your head.
It’s primarily a hardware problem. We can get electrodes into brains, but they deteriorate quickly because they are too thick. We can have plugs on people’s heads, but it’s ruling out any real-world usage. All these factors hold us back at the moment.
That’s why the Neuralink announcement was very interesting. They get a rather large number of electrodes into brains using well-engineered approaches that make that possible. What makes the difference is that Neuralink takes the best ideas in all the different domains and puts them together.
Q: Most examples in pop culture of connecting brains to machines have villainous or nefarious ends. Does that match up with how brain-machine interfaces are currently being developed?
Let’s say you’ve had a stroke, you can’t talk, but there’s a prosthetic device that allows you to talk again. Or if you lost your arm, and you get a new one that’s as good as the original—that’s absolutely a force for good.
It’s not a dark, ugly future thing, it’s a beautiful step forward for medicine. I want to make massive progress in these diseases. I want patients who had a stroke to talk again; I want vets to have prosthetic devices that are as good as the real thing. I think short-term this is what’s going to happen, but we are starting to worry about the dark sides.