George Rebane
Rejoice! We have all lived to see the practical solution to one of the most complex and important processes that define and determine how all critters live and die. It’s called protein folding, and it describes how bio-molecules called amino acids first hook up with each other into a very long string, which then snaps into a (minimum energy) shape that enables it to function with other bio-molecules. (more here) These other bio-molecules may be on the surface of or inside various kinds of living cells. And here is how MIT Technology Review describes it –
“A protein is made from a ribbon of amino acids that folds itself up with many complex twists and turns and tangles. This structure determines what it does. And figuring out what proteins do is key to understanding the basic mechanisms of life, when it works and when it doesn’t. Efforts to develop vaccines for covid-19 have focused on the virus’s spike protein, for example. The way the coronavirus snags onto human cells depends on the shape of this protein and the shapes of the proteins on the outsides of those cells. The spike is just one protein among billions across all living things; there are tens of thousands of different types of protein inside the human body alone.”
The basic idea to grasp here is that proteins do their work physically through the way they are shaped, with all kinds of complex sticky-out parts that enable them to latch onto other molecules or even ‘destroy’ them. It turns out that all the complex stuff that goes on inside the deep recesses of living things depends on how giant ‘LEGO games’ are assembled and played in very tiny but complex universes.
There exist myriads of different proteins numbering in the, who knows, hundreds of thousands (millions?), with the possibility to assemble gazillions more differently shaped proteins that don’t yet exist in nature. And here has been the rub. We can write down and/or derive the chemical structure of a protein in the form that all high school chemistry students are taught, with all the N, C, H, O, … atoms hooking up to each other through various combinations. But that only tells us the ‘stretched out’ sequence of the protein’s constituents. However, that’s not how proteins exist and do their work. When the necessary ingredients for a given protein are put into a ‘soup’, they tend to hook up according to one or more of the possible stretched out versions, and then instantly this long and complex string of atoms folds or bunches up into a very special shape that gives it the ability to do its work.
The very special folded shape (see graphic) is brought about by the folding molecule, like all conformable structures in our universe, seeking its minimum energy state. The energy level of a bunch of connected atoms is determined by their resultant electric field which in turn is determined by the physical configuration of the atoms with respect to each other. And now you can see that for a bio-molecule, with thousands of strung together atoms, there are a lot of possible shapes each with its own very complex electric field and corresponding energy content. Now which of these gazillions of shapes is at the lowest energy level? Or said differently –
“Identifying a protein’s structure is very hard. For most proteins, researchers have the sequence of amino acids in the ribbon but not the contorted shape they fold into. And there are typically an astronomical number of possible shapes for each sequence. Researchers have been wrestling with the problem at least since the 1970s, when Christian Anfinsen won the Nobel prize for showing that sequences determined structure.”
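To put a rough number on "astronomical", here is a back-of-the-envelope sketch in Python. The inputs are illustrative assumptions of mine, not figures from the article quoted above: a smallish protein of 100 amino acids, only 3 possible local shapes per amino acid, and a hypothetical machine that can test ten trillion shapes per second. The function name `levinthal_estimate` is likewise made up for this illustration.

```python
def levinthal_estimate(residues=100, shapes_per_residue=3, samples_per_second=1e13):
    """Count candidate shapes and the years needed to try every one of them.

    All three defaults are illustrative assumptions, not measured values.
    """
    # Each amino acid independently picks one of a few local shapes,
    # so the total count multiplies up along the chain.
    conformations = shapes_per_residue ** residues
    seconds = conformations / samples_per_second
    years = seconds / (365.25 * 24 * 3600)
    return conformations, years


conformations, years = levinthal_estimate()
print(f"{conformations:.2e} candidate shapes, about {years:.1e} years to try them all")
```

Under these assumptions the count is 3 to the 100th power, and trying every shape would take on the order of 10^27 years, vastly longer than the age of the universe. Real proteins fold in a blink, so nature is clearly not searching exhaustively, and neither can we.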
Google’s DeepMind AI outfit has come up with a humongous deep-learning neural net called AlphaFold that was trained on hundreds of thousands of known protein molecules. The bottom line here is that when it is given the stretched out structure of the thousands of hooked up atoms, it is able to ‘very quickly’ figure out how the stretched out version folds into a very specific minimum energy form with various sticky-out parts, dents, and deep holes that make it in/compatible with certain other bio-molecular structures in a critter, plant, or organic broth. (more here)
To give you an idea of the breakthrough, in the old days (i.e. yesterday) our fastest computers would wrestle with a given stretchy structure for months or years looking for the absolute minimum energy configuration. (Each step requires computing the unimaginably complex electric field that results from the physical positions of the atoms in a single feasible configuration, one allowed by the physics of our universe, and there are millions upon millions of such configurations to check.) Today AlphaFold has reduced that time to somewhere between a few hours and a few days. This opens up whole new worlds of bio-molecular design for all kinds of new medicines, energy conversion bio-molecules, foods, materials, …
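For readers who like to see the old brute-force approach in miniature, here is a toy Python sketch. It uses the textbook "HP lattice" simplification, which is my assumption for illustration and is neither real molecular physics nor how AlphaFold works: each amino acid is either H (water-hating) or P (water-loving), the chain is laid out on a flat grid, and every pair of H's that end up as grid neighbors without being chain neighbors lowers the energy by one. The names `fold_energy` and `best_fold` are hypothetical.

```python
from itertools import product

# Grid steps for laying the chain out one residue at a time.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_energy(sequence, moves):
    """Return the toy energy of one fold, or None if the chain overlaps itself."""
    positions = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    if len(set(positions)) != len(positions):
        return None  # two residues on the same grid point: not a feasible shape
    index_at = {p: i for i, p in enumerate(positions)}
    energy = 0
    for i, (x, y) in enumerate(positions):
        if sequence[i] != "H":
            continue
        # Look only right and up so each neighboring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


def best_fold(sequence):
    """Try every move string and keep the lowest-energy feasible fold."""
    best_energy, best_moves, tried = 0, None, 0
    for moves in product("UDLR", repeat=len(sequence) - 1):
        tried += 1
        energy = fold_energy(sequence, moves)
        if energy is not None and (best_moves is None or energy < best_energy):
            best_energy, best_moves = energy, "".join(moves)
    return best_energy, best_moves, tried


energy, moves, tried = best_fold("HPPHPPHH")
print(f"lowest toy energy {energy} via moves {moves} after trying {tried} shapes")
```

The number of move strings to try is 4 raised to the chain length minus one, so every added amino acid quadruples the work. That is fine for eight residues and hopeless for hundreds, even before any realistic energy calculation enters the picture.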
And to show how technology is accelerating, there is an even faster algorithm working on a recurrent geometric network (RGN) that promises to be “a million times faster” than AlphaFold, able to solve the folding problem in seconds. And as all this work is being published, a hundred entrepreneurial efforts will launch, not only to implement RGN-based folding tools, but also to use them to solve important problems and provide humankind with better healthcare, cleaner environments, cheaper energy, and new foods – all affordable like never before.
There is a lot more to be said about this breakthrough – e.g. development of an entirely new type of bio-computer that is faster and more energy efficient than today’s von Neumann silicon-based computers. And perhaps the intelligent machine that achieves Singularity peerage with humans will be an auto-configuring bio-computer. Now ain’t minimally regulated and taxed capitalism wonderful?

