UC Davis-Based Design2Data Course Provides Research Opportunities to Students Across Nation
- Design2Data is a course-based undergraduate research experience (CURE) that teaches students an enzyme design-build-test workflow.
- In the 10-week course, students build and test single point mutations in an enzyme of interest, investigating how those mutations affect the protein's function and contributing that knowledge to an open access database.
- Since its founding in 2019, the course has been rolled out to 25 institutions across the nation.
The future is proteins. Medicine, biotechnology, agriculture and other fields have benefited from scientists harnessing protein engineering to improve therapeutics, biofuels, crop yields and much more.
“The importance of understanding the way that proteins work is difficult to overstate,” said Ashley Vater, an instructional designer in Associate Professor of Chemistry Justin Siegel’s lab. “In every sector that we humans care about, there’s space to make real progress through protein engineering.”
The Siegel Lab, housed in the UC Davis Genome Center, focuses on optimizing enzymes, proteins that act as nature’s catalysts, to solve the world’s most pressing problems.
But proteins are diverse and their structures can be complex. The human body alone produces around 20,000 different kinds of proteins. And there are innumerable possible variants in the sequence space of even small proteins. Extrapolate beyond that to different species and the possibilities become mind-boggling.
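To give a rough sense of that scale, here is a minimal arithmetic sketch. It assumes the standard 20-letter amino acid alphabet and a hypothetical 100-residue protein; the length and the function names are illustrative choices, not figures or tools from the Design2Data project.

```python
# Illustrative arithmetic only. The protein length used below is a
# hypothetical value chosen for the example, not a number from the article.
AMINO_ACIDS = 20  # size of the standard amino acid alphabet


def single_point_variants(length: int) -> int:
    """Count sequences that differ from the original at exactly one position."""
    # Each position can be swapped for any of the 19 other amino acids.
    return (AMINO_ACIDS - 1) * length


def full_sequence_space(length: int) -> int:
    """Count every possible sequence of the given length."""
    return AMINO_ACIDS ** length


if __name__ == "__main__":
    length = 100  # hypothetical small protein
    total = full_sequence_space(length)
    print(f"single-point variants: {single_point_variants(length):,}")
    print(f"all possible sequences: a {len(str(total))}-digit number")
```

Even under these simplified assumptions, the set of single point mutations is small enough for many students to work through together, while the full sequence space is far too large to test exhaustively.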
“Exploring the functional effects of enzyme variants is a project that lends itself to many hands and many minds,” said Vater. “The data only really gets interesting when we have a lot of it.”
That’s why the Siegel Lab, Vater and a core team of faculty are enlisting the help of hundreds of undergraduates at UC Davis and across the country. They created Design2Data, a course-based undergraduate research experience (CURE) that teaches students an enzyme design-build-test workflow and allows them to contribute knowledge to an open access database.
“The goal of the project is to collect enough data to train functionally predictive machine learning algorithms,” said Vater. “Better computational tools in this space would really accelerate our ability to reach solutions for the big challenges we face in food, energy and health.”
The course also aims to provide undergraduate students with the opportunity to advance scientific knowledge while developing their social capital, said Vater, who earned two degrees from UC Davis: a Bachelor of Science in biological sciences in 2012 and a Master of Science in medical microbiology in 2017.
“Students should be doing something that means something, that’s really useful instead of just consuming knowledge,” said Vater. “I think they should be producing knowledge while they’re learning. We’ve seen over and over again that undergrads are totally capable of this.”
The Design2Data workflow
Since the founding of Design2Data in 2019, Vater and her colleagues have rolled out the CURE to 25 institutions across the nation, with more than 800 students participating so far and enrollment rising rapidly.
“We have faculty from all over the country at different types of institutions, including community colleges and liberal arts schools,” she said. “We really try to connect with institutions that have limited research resources, meaning it might be especially hard for their students to get research experience.”
In the 10-week course, students build and test single point mutations in an enzyme of interest.
“Similar datasets that exist in the world are limited in terms of data quality or data set size,” Vater said. “We think that if we can get that detailed functional data for lots of variants, we can start to understand that relationship between enzyme structure and enzyme function.”
At the start of the Design2Data course, students model mutations using accessible computational tools like Foldit, a collaborative software program that allows the visualization of proteins.
“They look at a representation of the protein in 3D space on a computer and they find a place on the protein that they want to change,” said Vater. “They can try modeling different mutations and the software will predict each mutation's thermodynamic folding favorability.”
In other words, does the mutation enhance or hinder the enzyme’s structural integrity and thus its functionality?
“Foldit is great at leveraging the user’s human spatial intuition and almost certainly something will catch a student’s eye, inspiring them to decide, ‘OK, this is the mutation I want to study,’” said Vater. “As soon as they’ve done that, we order the little snippet of DNA that encodes for that mutation.”
After the students build a sequence-verified mutant with common molecular biology techniques, they use Escherichia coli for protein production and standard chromatography methods to purify their variant enzymes. The purified enzymes are then tested for functionality with assays that answer the questions: How efficiently does the new variant do its job? And how much heat can be applied before it loses its activity?
“So we capture both of those engineering-relevant functional data and then the students add them to the Design2Data database,” said Vater.
A catalyst for growth
Thus far, students in the Design2Data program have focused their investigations on the enzyme β-glucosidase B (BglB). Affectionately called “bagel,” this enzyme has the potential to play a key role in the production of biofuels.
“It’s also really amenable to class and to novice hands, which is not the case for all enzymes,” said Vater.
Some students and member institutions are starting to explore whether the Design2Data workflow can be applied to other enzymes.
“We really think, research-wise, there’s a lot of value in starting to branch out and investigate these other families of enzymes,” said Vater. “And it would be cool for the students to engage with systems that have different commercial, environmental or health-related applications.”
As more institutions join the Design2Data network, Vater is excited for the future. She projects that by the end of 2022, the Design2Data program will have reached close to 2,000 students.
“If we maintain our momentum, we think we can serve over 20,000 students in the next five years,” she said. “We know research experiences change student lives; we think there’s huge potential to make a difference with this program.”