data science – Penn Bioengineering Blog

Who, What, Why: Lasya Sreepada on Decoding Alzheimer’s Disease

*Lasya Sreepada, Ph.D. student in Bioengineering*

Lasya Sreepada has always been fascinated by the brain and the underlying biology that shapes how people develop and age. “My curiosity traces back to observing differences between myself and my sister,” says Sreepada, a Ph.D. candidate in Bioengineering whose research unites efforts across Penn Medicine and Penn Engineering. “We grew up in the same environment but had remarkably different personalities, which led me to question what drove these differences and which brought me to the brain.”

Her academic journey began by applying medical imaging to understand how brain injuries sustained by professional athletes or military veterans impact their brain structure and chemistry over time. She became curious about how neurotrauma impacts aging and degeneration in the long term. Now, she leverages large, multimodal datasets to investigate neurodegenerative disease, with a particular focus on Alzheimer’s.

Read the full story in Penn Today.

Lasya Sreepada is a Bioengineering Ph.D. student at the Bioinformatics in Neurodegenerative Disease (BiND) Lab at Penn, advised by Corey McMillan and Dave Wolk, both Associate Professors in Neurology and members of the Bioengineering Graduate Group.

March 18, 2024

Beyond Bias: The Annual Women in Data Science Conference Unites Women across Penn

by Ian Scheffler

Lasya Sreepada, a doctoral student in Bioengineering (BE), addresses the crowd. (Image: Lamont Abrams)

In Invisible Women: Data Bias in a World Designed for Men, Caroline Criado Perez notes that the default perspective for virtually all data collection and analysis is male. (Hence crash test dummies being designed to mimic male bodies, air conditioning systems relying on a model of the male metabolism, and women’s unique heart attack symptoms other than chest pain — like nausea and back pain — often going unrecognized, even by women experiencing them.)

Nearly a decade ago, a group of women at Stanford decided to address this issue by convening a one-day technical conference on data science; that meeting has now grown into a worldwide movement, with hundreds of sister conferences each year — a tradition in which Penn Engineering is proud to take part.

By 2030, Women in Data Science (WiDS), the non-profit that spun out of that early meeting, hopes to achieve 30% representation of women in data science globally. For Susan Davidson, Weiss Professor in Computer and Information Science (CIS) and a co-chair of the annual WiDS @ Penn conference, the benefits of participating in WiDS go beyond just networking. “To see women who are successful in your field is extremely encouraging,” says Davidson.

This year, Penn Engineering partnered with Analytics at Wharton (AAW) and the Penn Museum to co-host the fifth annual WiDS @ Penn conference, bringing together dozens of women from across the University and beyond to learn about the latest applications of data science in topics as diverse as online education and health care.

“It gave me the opportunity to not only show others what it means to be a data scientist,” says Lasya Sreepada, a doctoral student in Bioengineering, who presented her work studying early-onset Alzheimer’s disease using large data sets, “but also what it means to be a woman applying data science to integrate multiple disciplines spanning neuroscience, genomics and radiology.”

Penn Engineering students from all levels of their academic careers participated, from Aashika Vishwanath, a sophomore in CIS, president of the Wharton Undergraduate Data Analytics Club and senior data science consultant at Wharton Analytics Fellows, who shared her work developing an AI-powered teaching assistant, to Betty Xu, a master’s student in Electrical and Systems Engineering, who collaborated with the Wharton Neuroscience Initiative to study financial decision making. “Data can help us know the unknown in every field,” says Xu. “You can be a great data scientist no matter your background.”

Read the full story in Penn Engineering Today.

May 31, 2023 / May 30, 2023

Why is Machine Learning Trending in Medical Research but not in Our Doctor’s Offices?

by Melissa Pappas

Illustration of a robot in a white room with medical equipment.

Machine learning (ML) programs computers to learn the way we do – through the continual assessment of data and identification of patterns based on past outcomes. ML can quickly pick out trends in big datasets, operate with little to no human interaction and improve its predictions over time. Due to these abilities, it is rapidly finding its way into medical research.

People with breast cancer may soon be diagnosed through ML faster than through a biopsy. Those suffering from depression might be able to predict mood changes through smartphone recordings of daily activities such as the time they wake up and the amount of time they spend exercising. ML may also help paralyzed people regain autonomy using prosthetics controlled by patterns identified in brain scan data. ML research promises these and many other possibilities to help people lead healthier lives.

But while the number of ML studies grows, its actual use in doctors’ offices has not expanded much past simple functions such as converting voice to text for notetaking.

The limitations lie in medical research’s small sample sizes and unique datasets. Such small datasets make it hard for machines to identify meaningful patterns: the more data, the more accurate ML diagnoses and predictions become. Many diagnostic uses would need thousands of subjects, but most studies include only dozens.

But there are ways to find significant results from small datasets if you know how to manipulate the numbers. Running statistical tests over and over again with different subsets of your data can indicate significance in a dataset that in reality may be just random outliers.

This tactic, known as P-hacking or feature hacking in ML, leads to the creation of predictive models that are too limited to be useful in the real world. What looks good on paper doesn’t translate to a doctor’s ability to diagnose or treat us.
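To make the effect concrete, here is a minimal simulation sketch. It is not taken from the article or from Kording’s program: the group sizes, the number of re-tests, and the function names are all illustrative assumptions. Both groups are drawn from the same distribution, so any "significant" difference is a false positive by construction.

```python
import math
import random

ALPHA = 0.05  # conventional significance threshold (assumed for illustration)


def z_test_p_value(group_a, group_b):
    """Two-sided p-value for a difference in means, assuming unit variance."""
    n_a, n_b = len(group_a), len(group_b)
    diff = sum(group_a) / n_a - sum(group_b) / n_b
    z = diff / math.sqrt(1.0 / n_a + 1.0 / n_b)
    return math.erfc(abs(z) / math.sqrt(2.0))


def smallest_p_value(rng, n_attempts, n_subjects=40, subset_size=20):
    """Re-test random subsets of one noise-only study and keep the best p-value."""
    # Both groups are pure noise: there is no real effect to find.
    treatment = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
    best = 1.0
    for _ in range(n_attempts):
        subset_a = rng.sample(treatment, subset_size)
        subset_b = rng.sample(control, subset_size)
        best = min(best, z_test_p_value(subset_a, subset_b))
    return best


def false_positive_rate(n_studies, n_attempts, seed=0):
    """Fraction of noise-only studies that report at least one p < ALPHA."""
    rng = random.Random(seed)
    hits = sum(smallest_p_value(rng, n_attempts) < ALPHA for _ in range(n_studies))
    return hits / n_studies


honest = false_positive_rate(n_studies=1000, n_attempts=1)
hacked = false_positive_rate(n_studies=1000, n_attempts=50)
print(f"one pre-planned test:   {honest:.1%} of studies look significant")
print(f"best of 50 subset runs: {hacked:.1%} of studies look significant")
```

With a single pre-planned test, roughly one noise-only study in twenty should cross the threshold, as the significance level implies. Keeping the best of many subset re-tests pushes that share far higher even though nothing real is being measured.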

These statistical mistakes, oftentimes done unknowingly, can lead to dangerous conclusions.

To help scientists avoid these mistakes and push ML applications forward, Konrad Kording, Nathan Francis Mossell University Professor with appointments in the Departments of Bioengineering and Computer and Information Science in Penn Engineering and the Department of Neuroscience at Penn’s Perelman School of Medicine, is leading an aspect of a large, NIH-funded program known as CENTER – Creating an Educational Nexus for Training in Experimental Rigor. Kording will lead Penn’s cohort by creating the Community for Rigor, which will provide open-access resources on conducting sound science. Members of this inclusive scientific community will be able to engage with ML simulations and discussion-based courses.

“The reason for the lack of ML in real-world scenarios is due to statistical misuse rather than the limitations of the tool itself,” says Kording. “If a study publishes a claim that seems too good to be true, it usually is, and many times we can track that back to their use of statistics.”

Such studies that make their way into peer-reviewed journals contribute to misinformation and mistrust in science and are more common than one might expect.

Read the full story in Penn Engineering Today.

March 29, 2023

Gregory Bowman Appointed Penn Integrates Knowledge University Professor

by Ron Ozio

Gregory Bowman, the Louis Heyman University Professor, has joint appointments in the Department of Biochemistry and Biophysics in the Perelman School of Medicine and the Department of Bioengineering in the School of Engineering and Applied Science. (Image: Courtesy of School of Engineering and Applied Science)

Gregory R. Bowman, a pioneer of biophysics and data science, has been named a Penn Integrates Knowledge University Professor at the University of Pennsylvania. The announcement was made today by President Liz Magill and Interim Provost Beth A. Winkelstein.

Bowman holds the Louis Heyman University Professorship, with joint appointments in the Department of Biochemistry and Biophysics in the Perelman School of Medicine and the Department of Bioengineering in the School of Engineering and Applied Science.

His research aims to combat global health threats such as COVID-19 and Alzheimer’s disease by better understanding how proteins function and malfunction, especially through new computational and experimental methods that map protein structures. This understanding of protein dynamics can lead to effective new treatments for even the most seemingly resistant diseases.

“Delivering the right treatment to the right person at the right time is vital to sustaining—and saving—lives,” Magill said. “Greg Bowman’s novel work holds enormous promise and potential to advance new forms of personalized medicine, an area of considerable strength for Penn. A gifted researcher and consummate collaborator, we are delighted to count him among our distinguished PIK University Professors.”

Bowman came to Penn from the Washington University School of Medicine’s Department of Biochemistry and Molecular Biophysics, where he had served on the faculty since 2014. He previously completed a three-year postdoctoral fellowship at the University of California, Berkeley.

Bowman’s research utilizes high-performance supercomputers for simulations that can better explain how mutations and disease change a protein’s functions. These simulations are enabled in part through the innovative Folding@home project, which Bowman directs. Folding@home empowers anyone with a computer to run simulations alongside a consortium of universities, with more than 200,000 participants worldwide.

His research has been supported by the National Science Foundation, National Institutes of Health, National Institute on Aging, and Packard Foundation, among others, and he has received a CAREER Award from the NSF, Career Award at the Scientific Interface from the Burroughs Wellcome Fund, and Thomas Kuhn Paradigm Shift Award from the American Chemical Society. He received a Ph.D. in biophysics from Stanford University and a B.S. (summa cum laude) in computer science, with a minor in biomedical engineering, from Cornell University.

“Greg Bowman’s highly innovative work,” Winkelstein said, “exemplifies the power of our interdisciplinary mission at Penn. He brings together supercomputers, biophysics, and biochemistry to make a vital impact on public health. This brilliant fusion of methods—in the service of improving people’s lives around the world—will be a tremendous model for the research of our faculty, students, and postdocs in the years ahead.”

The Penn Integrates Knowledge program is a University-wide initiative to recruit exceptional faculty members whose research and teaching exemplify the integration of knowledge across disciplines and who are appointed in at least two schools at Penn.

The Louis Heyman University Professorship is a gift of Stephen J. Heyman, a 1959 graduate of the Wharton School, and his wife, Barbara Heyman, in honor of Stephen Heyman’s uncle. Stephen Heyman is a University Emeritus Trustee and member of the School of Nursing Board of Advisors. He is Managing Partner at Nadel and Gussman LLC in Tulsa, Oklahoma.

This story originally appeared in Penn Today.

Dr. Bowman is Penn Bioengineering’s third PIK Professor after Kevin Johnson and Konrad Kording. See the full list of University PIK Professors here.

September 20, 2022

Training the Next Generation of Scientists on Soft Materials, Machine Learning and Science Policy

by Melissa Pappas

Developing new soft materials requires new data-driven research techniques, such as autonomous experimentation. Data regarding nanometer-scale material structure, taken by X-ray measurements at a synchrotron, can be fed into an algorithm that identifies the most relevant features, represented here as red dots. The algorithm then determines the optimum conditions for the next set of measurements and directs their execution without human intervention. Brookhaven National Laboratory’s Kevin Yager, who helped develop this technique, will co-teach a course on it as part of a new Penn project on Data Driven Soft Materials Research.

The National Science Foundation’s Research Traineeship Program aims to support graduate students, educate the STEM leaders of tomorrow and strengthen the national research infrastructure. The program’s latest series of grants is going toward university programs focused on artificial intelligence and quantum information science and engineering – two areas of high priority in academia, industry and government.

Chinedum Osuji, Eduardo D. Glandt Presidential Professor and Chair of the Department of Chemical and Biomolecular Engineering (CBE), has received one of these grants to apply data science and machine learning to the field of soft materials. The grant will provide five years of support and a total of $3 million for a new Penn project on Data Driven Soft Materials Research.

Osuji will work with co-PIs Russell Composto, Professor and Howell Family Faculty Fellow in Materials Science and Engineering, Bioengineering, and in CBE; Zahra Fakhraai, Associate Professor of Chemistry in Penn’s School of Arts & Sciences (SAS) with a secondary appointment in CBE; Paris Perdikaris, Assistant Professor in Mechanical Engineering and Applied Mechanics; and Andrea Liu, Hepburn Professor of Physics and Astronomy in SAS. All of them will help run the program and provide the connections between the multiple fields of study where its students will train.

These and other affiliated faculty members will work closely with co-PI Kristin Field, who will serve as Program Coordinator and Director of Education.

Read the full story in Penn Engineering Today.

August 9, 2022

Kevin Johnson Appointed Senior Fellow at Penn LDI

Congratulations to Kevin B. Johnson, David L. Cohen University Professor, on his recent appointment as a Senior Fellow in the Leonard Davis Institute of Health Economics at the University of Pennsylvania (Penn LDI). Johnson, an expert in health care innovation and health information technology, holds appointments in Biostatistics, Epidemiology and Informatics in the Perelman School of Medicine and Computer and Information Science in the School of Engineering and Applied Science. He also holds secondary appointments in Bioengineering, Pediatrics, and in the Annenberg School for Communication and is Vice President for Applied Informatics in the University of Pennsylvania Health System.

Penn LDI is Penn’s hub for health care delivery, health policy, and population health; it connects and amplifies experts and thought leaders and trains the next generation of researchers. Johnson joins over 500 Fellows from across all of Penn’s schools, the University of Pennsylvania Health System, and the Children’s Hospital of Philadelphia. Johnson brings expertise in Health Care Innovation, Health Information Technology, Medication Adherence, and Social Media to his new fellowship and has extensively studied healthcare informatics with the goal of improving patient care.

Learn more about Penn LDI on their website.

Learn more about Johnson’s research on his personal website.

July 7, 2022

Kevin Johnson: Informatics Evangelist

by Ebonee Johnson

Kevin Johnson is used to forging his own path in the fields of healthcare and computer science.

A picture of Johnson as a child, from his children’s book “I’m a Biomedical Expert Now!”

If you ask him to locate his niche within these fields, Johnson, David L. Cohen University Professor and Penn Integrates Knowledge (PIK) Professor with appointments in Penn Engineering and the Perelman School of Medicine, would say “informatics.” But that doesn’t tell the whole story of the board-certified pediatrician, who has dedicated his career to innovations in how patients’ information is created, documented and shared, all with the goal of improving the quality of healthcare they receive.

Informatics, the study of the structure and behavior of interactions between natural and computational systems, is an umbrella term. Within it, there’s bioinformatics, which applies informatics to biology, and biomedical informatics, which looks at those interactions in the context of healthcare systems. Finally, there is clinical informatics, which further focuses on the settings where healthcare is delivered, and where Johnson squarely places himself.

“But you can just call it ‘informatics,’” says Johnson. “It will be easier.”

He mainly studies how computational systems can improve ambulatory care — sometimes known as outpatient care, or the kind of care hospitals give to patients without admitting them — in real time. If you’ve ever heard your doctor complain about the amount of time it takes them to input the information they get from you during your visit, or wondered why they need to capture this information during the visit in the first place, these are some of the questions Johnson is investigating.

“We’re taking care of patients but we’re getting frustrated by things that we thought these new computers should be able to fix,” says Johnson. “I think there’s a very compelling case for using engineering principles to reimagine electronic health records.”

Read the full story in Penn Engineering Today.

Kevin Johnson is the David L. Cohen University of Pennsylvania Professor in the Departments of Biostatistics, Epidemiology and Informatics and Computer and Information Science. As a Penn Integrates Knowledge (PIK) University Professor, Johnson also holds appointments in the Departments of Bioengineering and Pediatrics, as well as in the Annenberg School for Communication. Johnson is the Vice President for Applied Informatics for the University of Pennsylvania Health System and has been elected to the American College of Medical Informatics (2004), the Academic Pediatric Society (2010), the National Academy of Medicine (Institute of Medicine) (2010), and the International Association of Health Science Informatics (2021).

May 26, 2022

Streamlining the Health Care Supply Chain

William Danon and Luka Yancopoulos, winners of the 2022 President’s Innovation Prize, will offer a software solution to make the health care supply chain more efficient.

by Brandon Baker

William Danon and Luka Yancopoulos pose in front of College Hall in April 2022. They are co-founders of Grapevine and the winners of the 2022 President’s Innovation Prize.

William Danon and Luka Yancopoulos are best friends. They’re also business partners.

The duo, who received this year’s President’s Innovation Prize (PIP) for Grapevine, met during sophomore year, connected through Yancopoulos’ roommate. As time went on, they did everything together: cooked meals, played basketball, and read and discussed fantasy novels.

“We spent a lot of time together,” Danon says.

It was only natural, then, that when the time came to start an actual venture, they’d do it together.

“They’re like brothers, in a very good way,” says mentor David Meaney of the School of Engineering and Applied Science, who describes their working dynamic as “complementary.” “I think that will serve them well. Most of what we do in faculty is collaborative, and I see elements of that in their partnership. I give them credit for stepping out and doing something unusual and keeping at it.”

How Grapevine came to be

Grapevine is a software solution and professional networking platform that connects small-to-medium-size players in the health care supply chain. It’s a sort of two-pronged solution: It helps institutions like hospital systems connect disjointed operations like procurement and inventory management internally, but also serves as a glue between these institutions and purveyors of medical equipment.

“William and Luka are impact-driven entrepreneurs whose collaborative synergies will take them far,” says Penn Interim President Wendell Pritchett. “The software provided by Grapevine is poised to reinvent how the health care industry buys and sells medical supplies and services and, truly, could not come at a timelier moment.”

The company is the evolution of a project they began at the onset of the COVID-19 pandemic, called Pandemic Relief Supply, which delivered $20 million of health care supplies to frontline workers.

“My mom was a nurse practitioner at New York Presbyterian Hospital, the largest hospital in the United States, and she was coming home with horror stories,” recalls Yancopoulos. “In surgery or the ER, a surgeon had to put on a garbage bag because they didn’t have a gown. And they gave her one mask to use for the rest of the month, and I’m seeing on the news, ‘Don’t wear a mask for more than three days.’”

This is where Yancopoulos and Danon first developed an interest in the health care supply chain. Using a database, available to Penn students, that maps the import of any good into the country, they used keyword matching to identify incoming streams of different goods and handed off their findings to New York Presbyterian procurement staff. When McKesson, the largest provider of health care products and services in the U.S., took notice of what they were doing and reached out, they realized they were onto something. In response to their success, they started a company called Pandemic Relief Supply to distribute reliable medical supplies, including items like medical-grade masks and gloves, to frontline workers in the healthcare space.

As time passed, that project evolved into something larger: Grapevine.

A mock-up screenshot of a business profile on the Grapevine professional networking platform. (Image: William Danon)

In short, Grapevine’s software creates a professional networking platform that resolves miscommunications between suppliers and buyers and adds a layer of transparency between the parties. Suppliers on the platform display real-time data about their inventory and shipping process, with timestamps; this prevents companies from cherry-picking data or making false claims and creates a more health-care-supply-specific space for companies to interact than, say, LinkedIn.

“Primarily, the first step is we want people to use it internally, and streamline operations, and then through that centralized operational data, you can push that externally and that’s where [Grapevine] becomes a connector,” explains Danon. “Because when you’re choosing to connect with someone, the reason you can do so way more efficiently or quickly, is that data is actual operational data.”

To accomplish this level of transparency, the beginnings of Grapevine involved lots of legwork. Last year, the duo moved to Los Angeles to take stock of what suppliers existed where, and how reliable they were. They realized that many suppliers existed around Los Angeles because of port access; many medical supplies are imported from Asia. Their time in LA made the problem feel even more tangible, they agree.

“We were able to see people were doing outdated processes, manual processes, because there’s no other option,” Danon says. “So, we said, ‘Let’s get out there and do some work to be digital and technologically innovative.’”

Read the full story in Penn Today.

N.B.: Yancopoulos’s senior design team created “Harvest” for their capstone project in Bioengineering, building on the existing Grapevine software package. Read Harvest’s abstract and view their final presentation on the BE Labs website.

April 19, 2022

Kevin Johnson Discusses the Future of the Electronic Health Record

Kevin B. Johnson, M.D., M.S., was featured in Cincinnati Children’s Hospital’s “Envisioning Our Future for Children” speaker series, discussing “the evolution of the EHR and its future directions.” An electronic health record, or EHR, is a digital record of a patient’s chart, recording health information and data, coordinating orders, tracking results, and providing patient support. Johnson “predicts a new wave of transformation in digital health technologies that could make rapid progress” in several areas of medicine, including reducing cost and improving patient outcomes. Johnson is Vice President for Applied Informatics at the University of Pennsylvania Health System and the David L. Cohen University Professor with appointments in Biostatistics, Epidemiology and Informatics and Computer and Information Science and secondary appointments in the Annenberg School for Communication, Pediatrics, and Bioengineering.

Read “What Will It Take to Make EHR a Partner Instead of a Burden?” in the Cincinnati Children’s Hospital Research Horizons blog. View Johnson’s seminar talk on the Envisioning Our Future website.

March 3, 2022 / March 23, 2022

Konrad Kording Appointed Co-Director of the CIFAR Learning in Machines & Brains Program

Konrad Kording, PhD (Photo by Eric Sucar)

Konrad Kording, Nathan Francis Mossell University Professor in Bioengineering, Neuroscience, and Computer and Information Science, was appointed the Co-Director of the CIFAR Program in Learning in Machines & Brains. The appointment will start April 1, 2022.

CIFAR is a global research organization that convenes extraordinary minds to address the most important questions facing science and humanity. CIFAR was founded in 1982 and now includes over 400 interdisciplinary fellows and scholars, representing over 130 institutions and 22 countries. CIFAR supports research at all levels of development in areas ranging from Artificial Intelligence and child and brain development, to astrophysics and quantum computing. The program in Learning in Machines & Brains brings together international scientists to examine “how artificial neural networks could be inspired by the human brain, and developing the powerful technique of deep learning.” Scientists, industry experts, and policymakers in the program are working to understand the computational and mathematical principles behind learning, whether in brains or in machines, in order to understand human intelligence and improve the engineering of machine learning. As Co-Director, Kording will oversee the collective intellectual development of the LMB program, which includes over 30 Fellows, Advisors, and Global Scholars. The program is also co-directed by Yoshua Bengio, the Canada CIFAR AI Chair and Professor in Computer Science and Operations Research at Université de Montréal.

Kording, a Penn Integrates Knowledge (PIK) Professor, was previously named an associate fellow of CIFAR in 2017. Kording’s groundbreaking interdisciplinary research uses data science to advance a broad range of topics that include understanding brain function, improving personalized medicine, collaborating with clinicians to diagnose diseases based on mobile phone data and even understanding the careers of professors. Across many areas of biomedical research, his group analyzes large datasets to test new models and thus get closer to an understanding of complex problems in bioengineering, neuroscience and beyond.

Visit Kording’s lab website and CIFAR profile page to learn more about his work in neuroscience, data science, and deep learning.
