Tag: images

Solving protein structures

This goes a long way toward explaining why pharma companies are so terrible at coming up with new drugs.

There is perhaps no better example of this than protein structure prediction, a problem that is very close to these companies’ core interest (along with docking), but on which they have spent virtually no resources. The little research on these problems done at pharmas is almost never methodological in nature, instead being narrowly focused on individual drug discovery programs. While the latter is important and obviously contributes to their bottom line, much like similar research done at tech companies, the lack of broadly minded basic research may have robbed biology of decades of progress, and contributed to the ossification of these companies’ software and machine learning expertise.

2020-11-30: Nature perspective on AlphaFold 1

DeepMind has made a gargantuan leap in solving one of biology’s grandest challenges — determining a protein’s 3D shape from its amino-acid sequence. “This is a big deal. In some sense the problem is solved.”

Perspective by someone in the field

Which brings me to what I think is the most exciting opportunity of all: the prospect of building a structural systems biology. In almost all forms of systems biology practiced today, from the careful and quantitative modeling of the dynamics of a small cohort of proteins to the quasi-qualitative systems-wide models that rely on highly simplified representations, structure rarely plays a role. This is unfortunate because structure is the common currency through which everything in biology gets integrated, both in terms of macromolecular chemistries, i.e., proteins, nucleic acids, lipids, etc, but also in terms of the cell’s functional domains, i.e., its information processing circuitry, its morphology, and its motility. A structural systems biology would take this seriously, deriving the rate constants of enzymatic and metabolic reactions, protein-protein binding affinities, and protein-DNA interactions all from structural models. We don’t yet know how much easier, if at all, it will be to predict these types of quantities from structure than from sequence—we need to put the dogma of “structure determines function” to the test. Even if the dogma were to fail in some instances, which it almost certainly will, partial success will open up new avenues.

2021-07-23: AlphaFold 2

DeepMind has used its AI to predict the shapes of nearly every protein in the human body, as well as the shapes of hundreds of thousands of other proteins found in 20 of the most widely studied organisms, including yeast, fruit flies, and mice. So far the trove consists of 350k newly predicted protein structures. DeepMind says it will predict and release the structures for more than 100m more in the next few months, more or less all proteins known to science. In the new version of AlphaFold, predictions come with a confidence score that the tool uses to flag how close it thinks each predicted shape is to the real thing. Using this measure, DeepMind found that AlphaFold predicted shapes for 36% of human proteins with an accuracy that is correct down to the level of individual atoms. Previously, after decades of work, only 17% of the proteins in the human body had had their structures identified in the lab.

Drug discovery is all about those biological effects – what else could it be concerned with? And these are higher-order things than just the naked protein structure, as valuable as that can be. Remember, our failure rate in the clinic is around 90% overall, and none of those failures were due to lack of a good protein structure. They were caused by much harder problems: what those proteins actually do in a living cell, how those functions differ in health and disease, how they differ between different sorts of human patients and between humans in general and the animal models that were used to develop the compounds, what other protein targets the drug candidate might have hit and the downstream effects (usually undesirable) that those kicked off, and on and on. So structural biology has been greatly advanced by these new tools. But it has not been outmoded, replaced, or rendered irrelevant. It’s more relevant than ever, and now we can get down to even bigger questions with it.

2022-04-12: Protein complexes

ColabFold later incorporated the ability to predict complexes. And in October 2021, DeepMind released an update called AlphaFold-Multimer that was specifically trained on protein complexes, unlike its predecessor. It predicted around 70% of the known protein–protein interactions.
Elofsson’s team used AlphaFold to predict the structures of 65k human protein pairs that were suspected to interact on the basis of experimental data. And a team led by Baker used AlphaFold and RoseTTAFold to model interactions between nearly every pair of proteins encoded by yeast, identifying more than 100 previously unknown complexes. Such screens are just starting points. They do a good job of predicting some protein pairings, particularly those that are stable, but struggle to identify more transient interactions. “Because it looks nice doesn’t mean it is correct. You need some experimental data that show you’re right.”
Attempts to apply AlphaFold to various mutations that disrupt a protein’s natural structure, including one linked to early breast cancer, have confirmed that the software is not equipped to predict the consequences of new mutations in proteins, since there are no evolutionarily-related sequences to examine.
The AlphaFold team is now thinking about how a neural network could be designed to deal with new mutations. This would require the network to better predict how a protein goes from its unfolded to its folded state. That would probably need software that relies only on what it has learnt about protein physics to predict structures. “One thing we are interested in is making predictions from single sequences without using evolutionary information. That’s a key problem that does remain open.”
AlphaFold-inspired tools could be used to model not just individual proteins and complexes, but entire organelles or even cells down to the level of individual protein molecules. “This is the dream we will follow for the next decades.”

2022-07-28: AlphaFold goes from 350k to 214m predictions.

Researchers have used AlphaFold to predict the structures of 214m proteins from 1m species, covering nearly every known protein on the planet. According to EMBL-EBI, around 35% of the 214m predictions are deemed highly accurate, which means they are as good as experimentally determined structures. Another 45% were deemed confident enough to rely on for many applications. DeepMind has committed to supporting the database for the long haul, and updates could occur annually.

2022-08-03: AlphaFold is open source with no commercial restrictions. What is the end game for DeepMind?

DeepMind has made policy decisions that have played a significant part in the transformation in structural biology. This includes its decision last July to make the code underlying AlphaFold open source, so that anyone can use the tool. Earlier this year, the company went further and lifted a restriction that hampered some commercial uses of the program. It has also helped to establish, and is financially supporting, the AlphaFold database maintained with EMBL-EBI. DeepMind deserves to be commended for this commitment to open science.

2022-11-02: Meta enters the fold with a large language model. The amazing generality of language models continues.

ESMFold isn’t quite as accurate as AlphaFold, but it is 60x faster at predicting structures. “What this means is that we can scale structure prediction to much larger databases.”

As a test case, they decided to wield their model on a database of bulk-sequenced ‘metagenomic’ DNA from environmental sources including soil, seawater, the human gut, skin and other microbial habitats. The vast majority of the DNA entries — which encode potential proteins — come from organisms that have never been cultured and are unknown to science. The team predicted the structures of 617m proteins. Of these 617m predictions, the model deemed 33% to be high quality. Millions of these structures are entirely novel, and unlike anything in databases of protein structures determined experimentally or in the AlphaFold database of predictions from known organisms. A good chunk of the AlphaFold database is made of structures that are nearly identical to each other, and ‘metagenomic’ databases “should cover a large part of the previously unseen protein universe”.

In terms of what % of protein space has been covered by these models, estimates vary widely. But it’s possible that life itself has explored all of protein space. If we take a median estimate of 10^30 proteins, and 10^8 with structure, we have a long way to go.

To examine how much of sequence space could have been explored, it is simplest to make upper and lower limit estimates for the number of unique amino acid sequences produced since the origin of life. Considering the upper limit, it is clear that bacteria dominate the planet in terms of the product of the number of cells (10^30) multiplied by the number of genes in each genome (10^4). Let us assume that every single gene in this total of 10^34 is unique and that evolution has been working on these genes for 4 Ga completely changing each gene to some other unique, new gene every single year. This gives an extreme upper limit of 4×10^43 different amino acid sequences explored since the origin of life. The contribution to this number of sequences by viral and eukaryotic genomes is difficult to estimate but it is very unlikely to be orders of magnitude greater than the 4×10^43 sequences from bacteria. If their contribution is similar or smaller, then it can be ignored in our rough calculation. A lower limit to the number of sequences explored is more difficult to estimate but it has been estimated that there are 10^9 different bacterial species on Earth. If we assume that each species has a unique complement of 10^3 sequences (an underestimate) and that only 1 sequence has changed per species per generation (a reasonable estimate based upon analysis of mutation rates in bacteria), and that the generation time is 1 year (a considerable underestimate for many modern bacteria, but perhaps reasonable for an ancient organism or one growing slowly in a poor environment), then we arrive at a figure of 4×10^21 different protein sequences tested since the origin of life.
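The arithmetic behind the two limits is easy to reproduce. A minimal sketch: the upper limit follows the quote directly, while the lower-limit product (species × sequences per species × years) is my assumption about how the quoted figure was reached, since it is the combination that reproduces 4×10^21. Variable names are mine.

```python
# Back-of-envelope check of the upper and lower limits quoted above.
YEARS = 4 * 10**9  # 4 Ga since the origin of life

# Upper limit: every gene in every bacterial cell turns into a new,
# unique sequence every single year.
cells = 10**30
genes_per_genome = 10**4
upper = cells * genes_per_genome * YEARS

# Lower limit (assumed reading of the quote, see the note above):
# species x sequences per species x years at one generation per year.
species = 10**9
sequences_per_species = 10**3
lower = species * sequences_per_species * YEARS

print(f"upper limit: {upper:.0e}")
print(f"lower limit: {lower:.0e}")
```

Both limits scale linearly with every input, so an input that is off by an order of magnitude shifts the result by the same factor.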

Although the oft-quoted 10^130 size of sequence space is far above these limits, the other more plausible estimates for the size of sequence space, particularly with limited amino acid diversity or reduced length, are near to or within these 2 limits. Considering the upper limit, all sequences containing 20, 8 and 3 types of amino acids have been explored if the chains are 33, 50 and 100 amino acids in length, respectively. Considering the lower limit, then virtually all chains of length 33 and 50 amino acids containing 5 or 3 types of amino acid, respectively, could have been explored. (The exploration of longer chains of 100 amino acids with only 2 types of residue is obviously much less complete but it is not a negligible fraction of the total.) Therefore it is entirely feasible that for all practical (i.e. functional and structural) purposes, protein sequence space has been fully explored during the course of evolution of life on Earth (perhaps even before the appearance of eukaryotes).
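To compare these cases with the two limits, it helps to put everything on a log10 scale. A rough sketch, assuming the size of a sequence space is simply (number of residue types)^(chain length); the function and constant names are mine, not from the quoted paper.

```python
import math

def log10_space(alphabet_size: int, length: int) -> float:
    """log10 of the number of distinct chains, i.e. alphabet_size ** length."""
    return length * math.log10(alphabet_size)

# The two limits quoted above, on the same log10 scale.
UPPER_LOG10 = math.log10(4e43)
LOWER_LOG10 = math.log10(4e21)

# (residue types, chain length) pairs mentioned in the text.
cases = [(20, 100), (20, 33), (8, 50), (3, 100), (5, 33), (3, 50), (2, 100)]
for alphabet_size, length in cases:
    size = log10_space(alphabet_size, length)
    print(f"{alphabet_size:>2} residue types, length {length:>3}: 10^{size:.1f}")
print(f"upper limit: 10^{UPPER_LOG10:.1f}, lower limit: 10^{LOWER_LOG10:.1f}")
```

By this count the 20-type, length-33 space falls under the upper limit, and the other reduced spaces land within a few orders of magnitude of one limit or the other, which is how I read "near to or within".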


2022-11-26: An open source reimplementation of AlphaFold does even better.

OpenFold is trained from scratch. Compared to AlphaFold2, OpenFold runs on proteins that are 1.7x larger, runs 2x as fast on short proteins, and is slightly more accurate. As more people can help drive this technology, we’ll get more and better discoveries.

2023-07-03: Foldseek

Sequence searches are fast, like searching a hard drive for a file name. But they often miss good matches because proteins with similar shapes can have vastly different sequences. Structure-based search methods look for shapes instead of sequences, but this can take thousands of times longer, because it’s computationally difficult to compare complex 3D objects. With Foldseek, researchers got the best of both worlds: the software represents a protein’s shape as a string of letters (a ‘structural alphabet’), thereby offering the sensitivity of shape-based searches but at the speed of sequence-based ones. Foldseek outperformed 2 popular structure-based search tools, TM-align and Dali, performing 24% and 8% better, respectively, and running 35k times and 20k times faster. Compared with a structural-alphabet-based tool called CLE-SW, Foldseek was 23% better, and 11x as fast.
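The structural-alphabet idea can be illustrated with a toy. This is only a sketch of the general trick (discretize local shape into letters, then reuse cheap string comparison), not Foldseek's actual alphabet or aligner: the 4-letter alphabet, the 90-degree angle bins, the made-up angle traces, and the use of `difflib` as a stand-in for a sequence aligner are all my assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical 4-letter structural alphabet: each letter covers a 90-degree
# bin of some per-residue backbone angle. Foldseek's real alphabet differs.
ALPHABET = "ABCD"

def encode(angles_deg):
    """Turn a list of per-residue angles (degrees) into a string of letters."""
    letters = []
    for angle in angles_deg:
        bin_index = int((angle % 360) // 90)  # 0..3
        letters.append(ALPHABET[bin_index])
    return "".join(letters)

def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for a real sequence aligner."""
    return SequenceMatcher(None, a, b).ratio()

# Made-up angle traces for three toy "proteins" (illustration only).
query = [50, 55, 60, 58, 200, 210, 205, 60, 55]
similar_shape = [48, 62, 57, 61, 195, 220, 215, 52, 58]
different_shape = [300, 310, 120, 130, 125, 290, 300, 110, 115]

database = {
    "similar_shape": encode(similar_shape),
    "different_shape": encode(different_shape),
}
query_string = encode(query)

# Rank database entries by string similarity to the query, best first.
ranked = sorted(
    database,
    key=lambda name: similarity(query_string, database[name]),
    reverse=True,
)
print(query_string, ranked)
```

Once shapes are strings, the expensive 3D comparison is replaced by string matching, which is where the speed-up in the quote comes from.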

2023-10-12: Create vaccines for predicted mutations

EVEscape is an impressive SARS-CoV-2 soothsayer. 50% of the mutations the model predicted in a region of the cell-invading spike protein most prone to change have already been observed in real-world SARS-CoV-2 variants, a figure that should grow as the virus continues to evolve. The team used the model to create a set of potential sequences for the SARS-CoV-2 spike protein, some containing as many as 46 mutations from the ancestral strain, with the hope of anticipating the virus’s future evolution and contributing to the development of experimental vaccines.

The model isn’t limited to SARS-CoV-2. It could also predict the evolution of HIV, influenza, Nipah and the virus that causes Lassa haemorrhagic fever. When a new virus with pandemic potential pops up, the team hopes to be ready with predictions for its evolution — and perhaps even vaccines based on those predictions.

EV Charging

Perhaps even more important than how much electricity EVs would consume is the question of when it would be consumed. We based the above estimates on optimal, off-peak charging patterns. If instead most EVs were to be charged in the afternoon, the electricity grid would need more generation capacity to avoid outages. While EVs might increase the amount of electricity the US consumes, the investment required to accommodate them may be smaller than it appears. Many regions already have sufficient generation capacity if vehicles are charged during off-peak hours. The energy storage on board EVs could provide the flexibility needed to shift charging times and help grid operators better manage the supply and demand of electricity.

2021-02-09: The US doesn’t have a charging standard. This is insane. Of course it means that Tesla becomes the standard.

2022-02-08: EV uptake simulation as a function of charging infrastructure. Pretty dumb simulation as it predicts a decline in EV sales.

50% of adults who are aware of electric vehicles say they are unlikely to seriously consider purchasing one. Consumers hesitant to make the switch cite concerns such as the high purchase price, limited driving range and lack of sufficient charging infrastructure.

Using a model that is a stylized portrayal of the US auto market, we’re able to simulate the impact of policies intended to overcome these concerns about EVs. Each scenario assumes a limited number of vehicle technologies are available to consumers; the number of cars on the road remains constant; new powertrains are supported by targeted advertising campaigns to raise awareness.

2022-10-14: Shell is trying to convert their gas stations to electric, but they are not price competitive. A Tesla Model 3 has a max battery of 82 kWh, which would cost £23 at the average rate, not £35. And much much cheaper at home. In a world where every parking spot can become a charging spot (why not?), this business plan isn’t going to work.

With 46k stations in 80 countries, Shell is the world’s biggest gasoline retailer. The Fulham station is one of several prototypes it’s planning as more cars shift to battery power, aiming to get feedback on what works while laying the groundwork to hit a target of net-zero emissions by 2050. Charging can be done more or less anywhere there’s a plug, so the issue is one that the oil giants, regional chains, and independents that run the world’s 770k filling stations will confront in the coming decades. What’s the value of their real estate in cities and on highways worldwide? Will people still show up if recharging takes 30 minutes or more? Is there a business model that will work for filling stations when people can also charge up at home, the office, or the mall? One advantage they can bring is faster fill-ups: as little as 10 to 20 minutes vs. many hours when using a standard charger at home. And they typically occupy prime locations with lots of traffic, where tired and hungry drivers are likely to grab a coffee or a snack while charging their cars.
At the Fulham facility, fully charging a Tesla Model 3 takes 30 min and can cost more than £35.
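The price gap is easier to see per kWh. A quick sketch using only the figures in this entry (82 kWh battery, £23 at the average rate, £35 at Fulham), and assuming both prices refer to a full 0 to 100% charge; variable names are mine.

```python
BATTERY_KWH = 82        # Tesla Model 3 max battery, from the note above
AVERAGE_COST_GBP = 23   # full charge at the average rate, from the note above
SHELL_COST_GBP = 35     # quoted Fulham price ("more than £35"), so a floor

# Implied price per kWh, assuming both figures cover a full charge.
average_rate = AVERAGE_COST_GBP / BATTERY_KWH
shell_rate = SHELL_COST_GBP / BATTERY_KWH
premium = SHELL_COST_GBP / AVERAGE_COST_GBP - 1

print(f"average rate: £{average_rate:.2f}/kWh")
print(f"Shell rate:   £{shell_rate:.2f}/kWh or more")
print(f"premium:      {premium:.0%} or more")
```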


2022-10-20: Drastically faster charging allows for much smaller batteries, which is great for battery supply, car efficiency and cost. The fastest Tesla supercharger takes 20 min and is not recommended for daily use.

A breakthrough in electric vehicle battery design has enabled a 10-minute charge time for a typical EV battery. “Our fast-charging technology works for most energy-dense batteries and will open a new possibility to downsize electric vehicle batteries from 150 to 50 kWh without causing drivers to feel range anxiety. The smaller, faster-charging batteries will dramatically cut down battery cost and usage of critical raw materials such as cobalt, graphite and lithium, enabling mass adoption of affordable electric cars.”

The technology relies on internal thermal modulation, an active method of temperature control to demand the best performance possible from the battery. Batteries operate most efficiently when they are hot, but not too hot. Keeping batteries consistently at just the right temperature has been a major challenge for battery engineers. Historically, they have relied on external, bulky heating and cooling systems to regulate battery temperature, which respond slowly and waste a lot of energy.

The researchers developed a new battery structure that adds an ultrathin nickel foil as the fourth component besides anode, electrolyte and cathode. Acting as a stimulus, the nickel foil self-regulates the battery’s temperature and reactivity which allows for 10-minute fast charging on just about any EV battery.
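Smaller batteries also lower the power a charger has to deliver. A rough sketch, assuming the 10 minutes covers a full charge at constant power with no losses (the quote does not say this, so treat the numbers as idealized); the function name is mine.

```python
def average_charging_power_kw(battery_kwh: float, minutes: float) -> float:
    """Average power (kW) needed to deliver battery_kwh in the given time.

    Idealized: constant power, full 0-100% charge, no conversion losses.
    """
    return battery_kwh * 60 / minutes

# The two battery sizes mentioned in the quote, both charged in 10 minutes.
for battery_kwh in (150, 50):
    power = average_charging_power_kw(battery_kwh, 10)
    print(f"{battery_kwh} kWh in 10 min -> {power:.0f} kW average")
```

Under these assumptions the charger power scales directly with battery size, so cutting the battery to a third cuts the required power to a third.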

2022-11-11: Tesla open-sources their charger (as previously predicted)

With more than 10 years of use and 30b EV charging km to its name, the Tesla charging connector is the most proven in North America, offering AC charging and up to 1 MW DC charging in one slim package. It has no moving parts, is 50% the size, and 2x as powerful as Combined Charging System (CCS) connectors.

In pursuit of our mission to accelerate the world’s transition to sustainable energy, today we are opening our EV connector design to the world. We invite charging network operators and vehicle manufacturers to put the Tesla charging connector and charge port, now called the North American Charging Standard (NACS), on their equipment and vehicles. NACS is the most common charging standard in North America: NACS vehicles outnumber CCS 2:1, and Tesla’s Supercharging network has 60% more NACS posts than all the CCS-equipped networks combined.


2022-11-28: Dumb scaling beats working with local mafias.

Charging EVs in parking lots with solar power is a marriage made in heaven. But the general rule for any solar or charging installation is that it be grid tied, so it can charge vehicles from the grid when the sun is not shining, and feed excess power back to the grid when the cars are not charging. Beam builds their stations in their factory, at scale, which is a big cost win, and then ships them on a flatbed trailer to the site, where they are simply dropped in any sunny parking spot. Without permits or contractors this can be done immediately, not months later. The Beam system is not cheap, however. It is just cheaper, for some locations, than the high cost of a traditional install.

Castle Reconstruction

6 Ruined British Castles Come Back to Life

Onward and NoeMam Studios have joined forces to digitally reconstruct 6 ruined castles across England, Scotland, Wales, and Northern Ireland. The series of gifs sees the castles fluidly re-emerge from the landscape, retelling the sense of place by showing “the true splendor enjoyed and defended by yesteryear’s barons, queens, and kings.”

College of Extraordinary Experiences

COEE is at minimum 3 things: a College, an Extraordinary Experience, and a Community. First, it’s a full-fledged College: a place for higher education and intellectual discourse, offering hands-on, real-world crash courses on Experience Design. Following 3 guiding principles (Rapid Prototyping, Co-Creation, and Flexible Focus), this intense 5-day event has the flavor of an “unconference.” There are a few loosely structured activities, as the core of the program is a co-created and co-designed immersive learning space. Information, ideas and practices flow among participants through facilitated group discussions, thought-provoking workshops (where PowerPoint presentations are adamantly banned), and impromptu conversations. One wishes all learning was as enjoyable, and all enjoyment as profound. Second, like a nested Russian matryoshka doll, COEE is itself an Extraordinary Experience, self-reflectively focusing on Extraordinary Experiences. It’s like Hogwarts meets Disneyland, thoroughly spiced with Burning Man ethos and costuming. For 5 intense days and nights, you live in a real medieval castle, nestled in gorgeous natural surroundings of breathtaking beauty. Spectacular things happen in this unusual, immersive environment, stimulated by a parade of colorful and wild activities, and playful mind-bending events. You are quickly advised to come to terms with the FOMO syndrome: there is so much going on, you can’t get to, or even see, all of it. You’ll never know when and where the next thing will happen. Whatever is in store for you, however, will certainly deserve the term “extraordinary.”

NYC Street Tree Map

The NYC Parks department maintains an online map of the city’s street trees — currently 679K mapped trees from 422 different species. Our tree map includes every street tree in New York City as mapped by our TreesCount! 2015 volunteers, and is updated daily by our Forestry team. On the map, trees are represented by circles. The size of the circle represents the diameter of the tree, and the color of the circle reflects its species. You are welcome to browse our entire inventory of trees, or to select an individual tree for more information.

Drosophila Titanus

This is very cool. In some sense, we have a moral imperative to spread life in the cosmos.

Your experiment involves creating flies that could survive on Titan. I understand that Titan is incredibly cold so the flies have to gradually get used to the very low temperatures but what would be the impact of Titan’s orange sky and the low frequency radio waves that emanate from Titan on their bodies? And how do you prepare them for that?

The project involved adapting the flies for a range of environmental conditions that are very different to those found on Earth. The cold is the most obvious along with the different atmospheric composition. There is also increased atmospheric pressure, radiation, chromatic characteristics and so on. To reach what could be conceived as the end of the project I would need to condition the flies for all of the characteristics of Titan. The radio waves experiment has been earmarked for a future stage in the project so I haven’t got too much to say about that right now. However, the chromatic adjustment has been something I’ve been working on over the last couple of years. The natural phototaxis of Drosophila – its instinct to move towards a certain type of light – is geared towards the blue end of the electromagnetic spectrum. To overcome this I kept the flies for a year under a Titan analog orange light before testing for adaptation. The selection experiment was modelled on a Y-Trap apparatus, a simple way of offering an organism 2 choices. The flies crawl up a tube and are faced with a junction offering orange light in one direction and blue light in the other, each tube ending with another non-return trap. Any flies taking the orange option are considered adapted and kept for breeding. Repeated iterations of the project smooth out random events.

Brain Research Notes

We are generally poor at describing our mental state. Our friends generally do a better job at identifying whether we are depressed. 25% of the world will develop a serious brain malfunction in their lifetime. If you spend 10-100 hours in an fMRI, we can read your thoughts. 40 Hz flickering LED light induces gamma oscillations in the brain. After 1 hour of exposure, she saw a 50% reduction in amyloid plaques in an Alzheimer’s rat model. Expansion microscopy is the opposite of normal microscopy. Instead of zooming in on the brain, you make the brain bigger with a polymer expansion similar to the gel found in super-absorbent diapers. Ed Boyden’s team can trigger brain activity at a targeted deep area with a pair of acoustic waves at close frequencies, like 2.00 and 2.01 Hz.