Refining Data into Knowledge, Turning Knowledge into Action

by Janelle Weaver

Heatmaps are used by researchers in the lab of Jennifer Phillips-Cremins to visualize which physically distant genes are brought into contact when the genome is in its folded state.

More data is being produced across diverse fields within science, engineering, and medicine than ever before, and our ability to collect, store, and manipulate it grows by the day. With scientists of all stripes reaping the raw materials of the digital age, there is an increasing focus on developing better strategies and techniques for refining this data into knowledge, and that knowledge into action.

Enter data science, where researchers try to sift through and combine this information to understand relevant phenomena, build or augment models, and make predictions.

One powerful technique in data science’s armamentarium is machine learning, a type of artificial intelligence that enables computers to automatically generate insights from data without being explicitly programmed as to which correlations they should attempt to draw.

Advances in computational power, storage, and sharing have enabled machine learning to be more easily and widely applied, but new tools for collecting reams of data from massive, messy, and complex systems—from electron microscopes to smart watches—are what have allowed it to turn entire fields on their heads.

“This is where data science comes in,” says Susan Davidson, Weiss Professor in Computer and Information Science (CIS) at Penn’s School of Engineering and Applied Science. “In contrast to fields where we have well-defined models, like in physics, where we have Newton’s laws and the theory of relativity, the goal of data science is to make predictions where we don’t have good models: a data-first approach using machine learning rather than using simulation.”

Penn Engineering’s formal data science efforts include the establishment of the Warren Center for Network & Data Sciences, which brings together researchers from across Penn with the goal of fostering research and innovation in interconnected social, economic and technological systems. Other research communities, including Penn Research in Machine Learning and the student-run Penn Data Science Group, bridge the gap between schools, as well as between industry and academia. Programmatic opportunities for Penn students include a Data Science minor for undergraduates, and a Master of Science in Engineering in Data Science, which is directed by Davidson and jointly administered by CIS and Electrical and Systems Engineering.

Penn academic programs and researchers on the leading edge of the data science field will soon have a new place to call home: Amy Gutmann Hall. The 116,000-square-foot, six-floor building, located on the northeast corner of 34th and Chestnut Streets near Lauder College House, will centralize resources for researchers and scholars across Penn’s 12 schools and numerous academic centers while making the tools of data analysis more accessible to the entire Penn community.

Faculty from all six departments in Penn Engineering are at the forefront of developing innovative data science solutions, primarily relying on machine learning, to tackle a wide range of challenges. Researchers show how they use data science in their work to answer fundamental questions in topics as diverse as genetics, “information pollution,” medical imaging, nanoscale microscopy, materials design, and the spread of infectious diseases.

Bioengineering: Unraveling the 3D genomic code

Scattered throughout the genomes of healthy people are tens of thousands of repetitive DNA sequences called short tandem repeats (STRs). But the unstable expansion of these repetitions is at the root of dozens of inherited disorders, including Fragile X syndrome, Huntington’s disease, and ALS. Why these STRs are susceptible to this disease-causing expansion, whereas most remain relatively stable, remains a major conundrum.

Complicating this effort is the fact that disease-associated STR tracts exhibit tremendous diversity in sequence, length, and localization in the genome. Moreover, that localization has a three-dimensional element because of how the genome is folded within the nucleus. Mammalian genomes are organized into a hierarchy of structures called topologically associated domains (TADs). Each one spans millions of nucleotides and contains smaller subTADs, which are separated by linker regions called boundaries.

Associate professor and Dean’s Faculty Fellow Jennifer E. Phillips-Cremins.

“The genetic code is made up of three billion base pairs. Stretched out end to end, it is 6 feet 5 inches long, and must be subsequently folded into a nucleus that is roughly the size of a head of a pin,” says Jennifer Phillips-Cremins, associate professor and dean’s faculty fellow in Bioengineering. “Genome folding is an exciting problem for engineers to study because it is a problem of big data. We not only need to look for patterns along the axis of three billion base pairs of letters, but also along the axis of how the letters are folded into higher-order structures.”

To address this challenge, Phillips-Cremins and her team recently developed a new mathematical approach called 3DNetMod to accurately detect these chromatin domains in 3D maps of the genome in collaboration with the lab of Dani Bassett, J. Peter Skirkanich Professor in Bioengineering.

“In our group, we use an integrated, interdisciplinary approach relying on cutting-edge computational and molecular technologies to uncover biologically meaningful patterns in large data sets,” Phillips-Cremins says. “Our approach has enabled us to find patterns in data that classic biology training might overlook.”

In a recent study, Phillips-Cremins and her team used 3DNetMod to identify tens of thousands of subTADs in human brain tissue. They found that nearly all disease-associated STRs are located at boundaries demarcating 3D chromatin domains. Additional analyses of cells and brain tissue from patients with Fragile X syndrome revealed severe boundary disruption at a specific disease-associated STR.

“To our knowledge, these findings represent the first report of a possible link between STR instability and the mammalian genome’s 3D folding patterns,” Phillips-Cremins says. “The knowledge gained may shed new light into how genome structure governs function across development and during the onset and progression of disease. Ultimately, this information could be used to create molecular tools to engineer the 3D genome to control repeat instability.”

Read the full story in Penn Today.

Investing in Penn’s Data Science Ecosystem

by Erica K. Brockmeier

As part of a major University-wide investment in science, engineering, and medicine, the Innovation in Data Engineering and Science Initiative aims to help Penn become a leader in developing data-driven approaches that can transform scientific discovery, engineering research, and technological innovation.

From smartphones and fitness trackers to social media posts and COVID-19 cases, the past few years have seen an explosion in the amount and types of data that are generated daily. To help make sense of these large, complex datasets, the field of data science has grown, providing methodologies, tools, and perspectives across a wide range of academic disciplines.

But the challenges that lie ahead for data scientists and engineers, from developing algorithms that don’t exacerbate biases to ensuring privacy protections, are equally complex and, in some instances, require entirely new ways of thinking.

As part of its $750 million investment in science, engineering, and medicine, the University has committed to supporting the future needs of this field. To this end, the Innovation in Data Engineering and Science (IDEAS) initiative will help Penn become a leader in developing data-driven approaches that can transform scientific discovery, engineering research, and technological innovation.

“The IDEAS initiative is game-changing for our University,” says President Amy Gutmann. “This new investment allows us to boost our interdisciplinary efforts across campus, recruit phenomenal additional team members, and generate an even more sound foundation for discovery, experimentation, and design. This initiative is a clear statement that Penn is committed to taking data science head-on.”

Building on a foundation of existing expertise

Led by the School of Engineering and Applied Science, the IDEAS initiative builds upon the steadily gathering momentum of its data-centric research. The Warren Center for Network and Data Sciences has been a major catalyst for this type of work, generating foundational research on ethical algorithms and data privacy, as well as collaborations that have drawn in faculty from the Wharton School, Law School, Perelman School of Medicine, and beyond. In addition, Wharton’s Department of Statistics and Data Science is an active partner in research and teaching initiatives that apply statistical modeling across a wide variety of fields.

“One of the unique things about data science and data engineering is that it’s a very horizontal technology, one that is going to be impacting every department on campus,” says George Pappas, Electrical and Systems Engineering Department chair. “When you have a horizontal technology in a competitive area, we have to figure out specific areas where Penn can become a worldwide leader.”

To do this, IDEAS aims to recruit new faculty across three research areas: artificial intelligence (AI) to transform scientific discovery, trustworthy AI for autonomous systems, and understanding connections between the human brain and AI.

Penn already has a strong foundation in using AI for scientific discovery thanks in part to investments in basic research facilities such as the Singh Center for Nanotechnology and the Laboratory for Research on the Structure of Matter. Additionally, there are centers focused on connecting researchers from different fields to address complex scientific questions, including the Center for Soft and Living Matter, Center for Engineering Mechanobiology, and Penn Institute for Computational Science.

Developing “trustworthy” algorithms, ones that work reliably outside of situations in which they are trained, is another key component of the IDEAS initiative. Ongoing research at the Penn Research in Embedded Computing and Integrated Systems Engineering (PRECISE) Center, the General Robotics, Automation, Sensing & Perception (GRASP) Lab, and DARPA-funded projects on the safety of AI-based aircraft control provide a starting point for furthering Penn’s research portfolio on safe, explainable, and trustworthy autonomous systems.

In the area of neuroscience and how the human brain is similar to AI and machine learning approaches, research from PIK Professor Konrad Kording and Dani Bassett’s Complex Systems lab exemplifies the types of cross-disciplinary efforts that are essential for addressing complex questions. By recruiting additional faculty in this area, IDEAS will help Penn make strides in bio-inspired computing and in future life-changing discoveries that could address cognitive disorders and nervous system diseases.

Read the full story in Penn Today.

Penn Bioengineering Celebrates Five Researchers on Highly Cited Researchers 2021 List

The Department of Bioengineering is proud to announce that five of our faculty have been named on the annual Highly Cited Researchers™ 2021 list from Clarivate:

Dani Bassett, Ph.D.

Dani S. Bassett, J. Peter Skirkanich Professor in Bioengineering and in Electrical and Systems Engineering
Bassett runs the Complex Systems lab which tackles problems at the intersection of science, engineering, and medicine using systems-level approaches, exploring fields such as curiosity, dynamic networks in neuroscience, and psychiatric disease. They are a pioneer in the emerging field of network science which combines mathematics, physics, biology and systems engineering to better understand how the overall shape of connections between individual neurons influences cognitive traits.

Jason Burdick, Ph.D.

Jason A. Burdick, Robert D. Bent Professor in Bioengineering
Burdick runs the Polymeric Biomaterials Laboratory which develops polymer networks for fundamental and applied studies with biomedical applications with a specific emphasis on tissue regeneration and drug delivery. The specific targets of his research include: scaffolding for cartilage regeneration, controlling stem cell differentiation through material signals, electrospinning and 3D printing for scaffold fabrication, and injectable hydrogels for therapies after a heart attack.

César de la Fuente, Ph.D.

César de la Fuente, Presidential Assistant Professor in Bioengineering and Chemical and Biomolecular Engineering in Penn Engineering and in Microbiology and Psychiatry in the Perelman School of Medicine
De la Fuente runs the Machine Biology Group which combines the power of machines and biology to prevent, detect, and treat infectious diseases. He pioneered the development of the first antibiotic designed by a computer with efficacy in animals, designed algorithms for antibiotic discovery, and invented rapid low-cost diagnostics for COVID-19 and other infections.

Carl June, M.D.

Carl H. June, Richard W. Vague Professor in Immunotherapy in the Perelman School of Medicine and member of the Bioengineering Graduate Group
June is the Director of the Center for Cellular Immunotherapies and the Parker Institute for Cancer Immunotherapy and runs the June Lab, which develops new forms of T cell-based therapies. June’s pioneering research in gene therapy led to the FDA approval for CAR T therapy for treating acute lymphoblastic leukemia (ALL), one of the most common childhood cancers.

Vivek Shenoy, Ph.D.

Vivek Shenoy, Eduardo D. Glandt President’s Distinguished Professor in Bioengineering, Mechanical Engineering and Applied Mechanics (MEAM), and in Materials Science and Engineering (MSE)
Shenoy runs the Theoretical Mechanobiology and Materials Lab, which develops theoretical concepts and numerical principles for understanding engineering and biological systems. His analytical methods and multiscale modeling techniques provide insight into a myriad of problems in materials science and biomechanics.

The highly anticipated annual list identifies researchers who demonstrated significant influence in their chosen field or fields through the publication of multiple highly cited papers during the last decade. Their names are drawn from the publications that rank in the top 1% by citations for field and publication year in the Web of Science™ citation index.

Bassett and Burdick were both on the Highly Cited Researchers list in 2019 and 2020.

The methodology that determines the “who’s who” of influential researchers draws on the data and analysis performed by bibliometric experts and data scientists at the Institute for Scientific Information™ at Clarivate. It also uses the tallies to identify the countries and research institutions where these scientific elite are based.

David Pendlebury, Senior Citation Analyst at the Institute for Scientific Information at Clarivate, said: “In the race for knowledge, it is human capital that is fundamental and this list identifies and celebrates exceptional individual researchers who are having a great impact on the research community as measured by the rate at which their work is being cited by others.”

The full 2021 Highly Cited Researchers list and executive summary can be found online here.

Dani Bassett Elected an American Physical Society Fellow

Dani Bassett, Ph.D.

Dani S. Bassett, J. Peter Skirkanich Professor in the departments of Bioengineering and Electrical and Systems Engineering, has been elected a 2021 Fellow of the American Physical Society (APS) “for significant contributions to the network modeling of the human brain, including dynamical changes caused by evolution, learning, aging, and disease.”

The prestigious APS Fellowship Program signifies recognition by one’s professional peers. Each year, no more than one half of one percent of the APS membership is recognized with this distinct honor. Bassett’s election and groundbreaking work in biological physics and network science will be recognized through presentation of a certificate at the APS March Meeting.

Bassett is a pioneer in the field of network neuroscience, an emerging subfield which incorporates elements of mathematics, physics, biology and systems engineering to better understand how the overall shape of connections between individual neurons influences cognitive traits. They lead the Complex Systems lab, which tackles problems at the intersection of science, engineering, and medicine using systems-level approaches, exploring fields such as curiosity, dynamic networks in neuroscience, and psychiatric disease.

Bassett recently collaborated with Penn artist-in-residence Rebecca Kamen and other scholars on an interdisciplinary art exhibit on the creative process in art and science at the Katzen Arts Center at American University. They have also published research modeling different types of curiosity and exploring gender-based citation bias in neuroscience publishing.

“I’m thrilled and humbled to receive this honor from the American Physical Society,” says Bassett. “I am indebted to the many fantastic mentees, colleagues, and mentors that have made my time in science such an exciting adventure. Thank you.”

Read more stories about Bassett’s research here.

Reimagining Scientific Discovery Through the Lens of an Artist

by Erica K. Brockmeier

Rebecca Kamen, Penn artist-in-residence and visiting scholar, has a new exhibition titled “Reveal: The Art of Reimagining Scientific Discovery” at American University Museum at the Katzen Arts Center that explores curiosity and the creative process across art and science. (Image: Greg Staley)

Rebecca Kamen, Penn artist-in-residence and visiting scholar, has long been interested in science and the natural world. A Philadelphia native and an artist with a 40-plus-year career, Kamen creates intersectional work that sheds light on the process of scientific discovery and its connections to art, with previous exhibitions ranging from a celebration of Apollo 11’s “spirit of exploration and discovery” to new representations of the periodic table of elements.

Now, in her latest exhibition, Kamen has created a series of pieces that highlight how the creative processes in art and science are interconnected. In “Reveal: The Art of Reimagining Scientific Discovery,” Kamen chronicles her own artistic process while providing a space for self-reflection that enables viewers to see the relationship between science, art, and their own creativity.

The exhibit, on display at the Katzen Arts Center at American University, was inspired by the work of Penn professor Dani Bassett and American University professor Perry Zurn, the exhibit’s faculty sponsor. The culmination of three years of work, “Reveal” features collaborations with a wide range of scientists, including philosophers at American University, microscopists at the National Institutes of Health studying SARS-CoV-2, and researchers in Penn’s Complex Systems Lab and the Addiction, Health, and Adolescence (AHA!) Lab.

Continue reading at Penn Today.

Dani S. Bassett is the J. Peter Skirkanich Professor in the departments of Bioengineering and Electrical and Systems Engineering in the School of Engineering and Applied Science at the University of Pennsylvania. Bassett also has appointments in the Department of Physics and Astronomy in Penn’s School of Arts & Sciences and the departments of Neurology and Psychiatry in the Perelman School of Medicine at Penn.

Rebecca Kamen is a visiting scholar and artist-in-residence in the Department of Physics & Astronomy in Penn’s School of Arts & Sciences.

David Lydon-Staley is an assistant professor in the Annenberg School for Communication at Penn and was formerly a postdoc in the Bassett lab.

Dale Zhou is a Ph.D. candidate in Penn’s Neuroscience Graduate Group.

“Reveal: The Art of Reimagining Scientific Discovery,” presented by the Alper Initiative for Washington Art and curated by Sarah Tanguy, is on display at the American University Museum in Washington, D.C., until Dec. 12.

The exhibition catalog, which includes an essay on “Radicle Curiosity” by Perry Zurn and Dani S. Bassett, can be viewed online.

Annenberg and Penn Bioengineering Research into Communication Citation Bias

Photo Credit: Debby Hudson / Unsplash

Women are frequently under-cited in academia, and the field of communication is no exception, according to research from the Annenberg School for Communication. The study, entitled “Gendered Citation Practices in the Field of Communication,” was published in Annals of the International Communication Association.

A new study from the Addiction, Health, & Adolescence (AHA!) Lab at the Annenberg School for Communication at the University of Pennsylvania found that men are over-cited and women are under-cited in the field of Communication. The researchers’ findings indicate that this problem is most persistent in papers authored by men.

“Despite known limitations in their use as proxies for research quality, we often turn to citations as a way to measure the impact of someone’s research,” says Professor David Lydon-Staley, “so it matters for individual researchers if one group is being consistently under-cited relative to another group. But it also matters for the field in the sense that if people are not citing women as much as men, then we’re building the field on the work of men and not the work of women. Our field should be representative of all of the excellent research that is being undertaken, and not just that of one group.”

The AHA! Lab is led by David Lydon-Staley, Assistant Professor of Communication and former postdoc in the Complex Systems lab of Danielle Bassett, J. Peter Skirkanich Professor in Bioengineering and in Electrical and Systems Engineering in the School of Engineering and Applied Science. Dr. Bassett and Bassett Lab members Dale Zhou and Jennifer Stiso, graduate students in the Perelman School of Medicine, also contributed to the study.

Read “Women are Under-cited and Men are Over-cited in Communication” in Annenberg School for Communication News.

“This is What a Data Scientist Looks Like”

Speakers at the second annual Women in Data Science @ Penn Conference.

Last month, the second annual Women in Data Science (WiDS) @ Penn Conference virtually gathered nearly 500 registrants to participate in a week’s worth of academic and industry talks, live speaker Q&A sessions, and networking opportunities.

The conference was hosted by Penn Engineering, Analytics at Wharton, Wharton Customer Analytics and Wharton’s Statistics Department. Its theme, “This is What a Data Scientist Looks Like,” emphasized the depth, breadth, and diversity of data science, both in terms of the subjects the field covers and the people who enter it.

Following welcoming remarks from Erika James, Dean of the Wharton School, and Vijay Kumar, Nemirovsky Family Dean of Penn Engineering, the conference began with a keynote address from President of Microsoft US and Wharton alumna Kate Johnson.

Conference sessions continued throughout the week, featuring panels of academic data scientists from around Penn and beyond, industry leaders from IKEA Digital, Facebook and Poshmark, and lightning talks from student speakers who presented their data science research.

All of the conference’s sessions are now available on YouTube and the 2021 WiDS Conference Recap, including a talk titled “How Humans Build Models for the World” by Danielle Bassett, J. Peter Skirkanich Professor in Bioengineering and Electrical and Systems Engineering.

Read more about the conference at Wharton Stories: “How Women in Data Science Rise to the Top.”

Originally posted in Penn Engineering Today.

‘As More Women Enter Science, It’s Time to Redefine Mentorship’

Danielle Bassett, Ph.D.

Danielle Bassett, J. Peter Skirkanich Professor in the departments of Bioengineering and Electrical and Systems Engineering, investigates how the shape of networks impacts the phenomena that arise from them. Much of that research is focused on networks of neurons, and how the different ways they are wired together in different people can influence their mental traits, such as memory or executive function.

Bassett is also interested in networks of people, however, as the shapes of those networks can have a major impact on a society’s traits. Last year, she and her colleagues published a study that investigated the network of citations neuroscience researchers produced in the course of their work, demonstrating a systemic gender bias that left women underrepresented in the literature.

Recently, Bassett spoke with WIRED’s Grace Huckins about the big-picture changes that must take place within academia for it to become truly equitable.

When a group of researchers at NYU Abu Dhabi published a paper in Nature Communications last fall suggesting that young women scientists should seek out men as mentors, the backlash was swift and vociferous. Countless scientists, many of them women, registered their indignation on Twitter—some even penning open letters and their own preprints in response. The original paper had found that female junior scientists who authored papers with male senior scientists saw their papers cited at higher rates. But a number of critics contested the assertion that this result established a link between male mentors and career performance. Scientists routinely coauthor articles with people who are not their mentors, they argued, and citation rates are just one metric of achievement. In response to these criticisms, the authors eventually retracted their paper. (They declined to comment to WIRED.)

But the paper had already stirred up a broader discussion about gender and mentorship in academia. For Danielle Bassett, a professor of bioengineering at the University of Pennsylvania, the methodological concerns that prompted the paper’s retraction were far from its worst sin. She herself has researched citation practices and found that, in neuroscience, papers with male senior authors are cited at a disproportionately high rate—primarily because other male scientists preferentially cite them. To suggest that young women should therefore try to author papers with men is, she believes, a grave error. “That was a problem in assigning blame,” she says. “The onus is on us to create a scientific culture that lets students choose a mentor that’s right for them.”

Continue reading Grace Huckins’s ‘As More Women Enter Science, It’s Time to Redefine Mentorship’ at WIRED.

Originally posted in Penn Engineering Today.

Studying ‘Hunters and Busybodies,’ Penn and American University Researchers Measure Different Types of Curiosity

by Melissa Pappas

Knowledge networks were created as participants browsed Wikipedia, where pages became nodes and relatedness between pages became edges. Two diverging styles emerged — “the busybody” and “the hunter.” (Illustrations by Melissa Pappas)

Curiosity has been found to play a role in our learning and emotional well-being, but due to the open-ended nature of how curiosity is actually practiced, measuring it is challenging. Psychological studies have attempted to gauge participants’ curiosity through their engagement in specific activities, such as asking questions, playing trivia games, and gossiping. However, such methods focus on quantifying a person’s curiosity rather than understanding the different ways it can be expressed.

Efforts to better understand what curiosity actually looks like for different people have underappreciated roots in the field of philosophy. Varying styles have been described with loose archetypes, like “hunter” and “busybody” — evocative, but hard to objectively measure when it comes to studying how people collect new information.

A new study led by researchers at the University of Pennsylvania’s School of Engineering and Applied Science, the Annenberg School for Communication, and the Department of Philosophy and Religion at American University, uses Wikipedia browsing as a method for describing curiosity styles. Using a branch of mathematics known as graph theory, their analysis of curiosity opens doors for using it as a tool to improve learning and life satisfaction.
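To make the graph-theory framing concrete, here is a minimal sketch of how a browsing session could be turned into a knowledge network. It is an illustration under stated assumptions, not the study's actual pipeline: the page names, the relatedness scores, the 0.5 threshold, and the function names are all hypothetical, and the two summary measures (edge density and average clustering) are standard graph quantities chosen only as examples of what could be compared across participants.

```python
from itertools import combinations


def build_knowledge_network(pages, relatedness, threshold=0.5):
    """Pages become nodes; sufficiently related page pairs become edges.

    `relatedness` maps frozenset({page_a, page_b}) to a score in [0, 1].
    Both the scores and the threshold are hypothetical stand-ins.
    """
    nodes = list(dict.fromkeys(pages))  # unique pages, in visit order
    adjacency = {page: set() for page in nodes}
    for a, b in combinations(nodes, 2):
        if relatedness.get(frozenset((a, b)), 0.0) >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def edge_density(adjacency):
    """Fraction of all possible page pairs that are connected."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    edges = sum(len(neighbors) for neighbors in adjacency.values()) / 2
    return edges / (n * (n - 1) / 2)


def average_clustering(adjacency):
    """Mean local clustering: how often a page's neighbors are also linked."""
    coefficients = []
    for neighbors in adjacency.values():
        k = len(neighbors)
        if k < 2:
            coefficients.append(0.0)  # fewer than two neighbors: no triangles
            continue
        linked = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
        coefficients.append(linked / (k * (k - 1) / 2))
    return sum(coefficients) / len(coefficients) if coefficients else 0.0


# Hypothetical browsing session and made-up relatedness scores.
visited = ["Graph theory", "Network science", "Neuron", "Graph theory", "Curiosity"]
scores = {
    frozenset(("Graph theory", "Network science")): 0.9,
    frozenset(("Graph theory", "Neuron")): 0.7,
    frozenset(("Network science", "Neuron")): 0.6,
    frozenset(("Neuron", "Curiosity")): 0.2,
}

network = build_knowledge_network(visited, scores)
print(edge_density(network), average_clustering(network))
```

In this toy session, three closely related pages form a tight triangle while the fourth page sits apart. A session that hopped between loosely related pages would yield a sparser, less clustered network, which is the kind of structural difference that graph measures can quantify.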

The interdisciplinary study, published in Nature Human Behaviour, was undertaken by Danielle Bassett, J. Peter Skirkanich Professor in Penn Engineering’s Departments of Bioengineering and Electrical and Systems Engineering; David Lydon-Staley, then a postdoctoral fellow in her lab and now an assistant professor in the Annenberg School for Communication; two members of Bassett’s Complex Systems Lab, graduate student Dale Zhou and postdoctoral fellow Ann Sizemore Blevins; and Perry Zurn, assistant professor in American University’s Department of Philosophy and Religion.

“The reason this paper exists is because of the participation of many people from different fields,” says Lydon-Staley. “Perry has been researching curiosity in novel ways that show the spectrum of curious practice and Dani has been using networks to describe form and function in many different systems. My background in human behavior allowed me to design and conduct a study linking the styles of curiosity to a measurable activity: Wikipedia searches.”

Zurn’s research on how different people express curiosity provided a framework for the study.

Read the full story in Penn Engineering Today.

Danielle Bassett and Jason Burdick are Among World’s Most Highly Cited Researchers

Danielle Bassett and Jason Burdick

The nature of scientific progress is often summarized by the Isaac Newton quotation, “If I have seen further it is by standing on the shoulders of giants.” Each new study draws on dozens of earlier ones, forming a chain of knowledge stretching back to Newton and the scientific giants his work referenced.

Scientific publishing and referencing have become more formal since Newton’s time, with databases of citations allowing for sophisticated quantitative analyses of that flow of information between researchers.

The Institute for Scientific Information and the Web of Science Group provide a yearly snapshot of this flow, publishing a list of the researchers who are in the top 1 percent of their respective fields when it comes to the number of times their work has been cited.

Danielle Bassett, J. Peter Skirkanich Professor in the departments of Bioengineering and Electrical and Systems Engineering, and Jason Burdick, Robert D. Bent Professor in the department of Bioengineering, are among the 6,389 researchers named to the 2020 list.

Bassett is a pioneer in the field of network neuroscience, which incorporates elements of mathematics, physics, biology and systems engineering to better understand how the overall shape of connections between individual neurons influences cognitive traits. Burdick is an expert in tissue engineering and the design of biomaterials for regenerative medicine; by precisely tailoring the microenvironment within these materials, they can influence stem cell differentiation or trigger the release of therapeutics.

Bassett and Burdick were named to the Web of Science’s 2019 Highly Cited Researchers list as well.

Originally posted in Penn Engineering Today.