Tag: ai

Competitive AI

OpenAI:

We built a neural theorem prover for Lean that learned to solve a variety of challenging high-school olympiad problems. These problems are not standard math exercises; they are used to let the best high-school students compete against each other. The prover uses a language model to find proofs of formal statements. Each time we find a new proof, we use it as new training data, which improves the neural network and enables it to iteratively find solutions to harder and harder statements. We achieved a new state of the art (41.2%) on the miniF2F benchmark, a challenging collection of high-school olympiad problems.
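
The loop described here (search for proofs, add each new proof to the training set, retrain) is a form of expert iteration. A toy sketch of that control flow, with a stand-in prover and hypothetical method names rather than OpenAI's actual code:

```python
class ToyProver:
    """Stand-in for the neural prover: it can 'prove' a statement whose
    difficulty is within its current skill, and each round of fine-tuning
    extends that skill (mimicking the bootstrapping effect)."""
    def __init__(self, skill=1):
        self.skill = skill

    def search_proof(self, stmt):
        # Toy setting: a statement is just its difficulty as an integer.
        return f"proof({stmt})" if stmt <= self.skill else None

    def fine_tune(self, new_proofs):
        self.skill += 1  # training on fresh proofs makes the prover stronger


def expert_iteration(model, statements, rounds=5):
    """Alternate proof search and retraining so the prover bootstraps
    from easy statements to harder ones."""
    proved = {}
    for _ in range(rounds):
        new_proofs = []
        for stmt in statements:
            if stmt in proved:
                continue
            proof = model.search_proof(stmt)  # real system: LM-guided search,
            if proof is not None:             # proofs checked by the Lean kernel
                proved[stmt] = proof
                new_proofs.append((stmt, proof))
        if not new_proofs:
            break  # no progress this round; stop
        model.fine_tune(new_proofs)  # each new proof becomes training data
    return proved
```

With five statements of increasing difficulty, each round unlocks the next one, which is the "harder and harder statements" dynamic in miniature.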

DeepMind:

We created a system called AlphaCode that writes computer programs at a competitive level. AlphaCode achieved an estimated rank within the top 54% of participants in programming competitions by solving new problems that require a combination of critical thinking, logic, algorithms, coding, and natural language understanding.

Most important century

The “most important century” series of blog posts argues that the 21st century could be the most important century ever for humanity, via the development of advanced AI systems that could dramatically speed up scientific and technological advancement, getting us more quickly than most people imagine to a deeply unfamiliar future.

  • The long-run future is radically unfamiliar. Enough advances in technology could lead to a long-lasting, galaxy-wide civilization that could be a radical utopia, dystopia, or anything in between.
  • The long-run future could come much faster than we think, due to a possible AI-driven productivity explosion.
  • The relevant kind of AI looks like it will be developed this century – making this century the one that will initiate, and have the opportunity to shape, a future galaxy-wide civilization.
  • These claims seem too “wild” to take seriously. But there are a lot of reasons to think that we live in a wild time, and should be ready for anything.
  • We, the people living in this century, have the chance to have a huge impact on huge numbers of people to come – if we can make sense of the situation enough to find helpful actions. But right now we aren’t ready for this.

Open-Ended Learning

Today, we published “Open-Ended Learning Leads to Generally Capable Agents,” a preprint detailing our first steps to train an agent capable of playing many different games without needing human interaction data. We created a vast game environment we call XLand, which includes many multiplayer games within consistent, human-relatable 3D worlds. This environment makes it possible to formulate new learning algorithms, which dynamically control how an agent trains and the games on which it trains. The agent’s capabilities improve iteratively as a response to the challenges that arise in training, with the learning process continually refining the training tasks so the agent never stops learning. The result is an agent with the ability to succeed at a wide spectrum of tasks — from simple object-finding problems to complex games like hide and seek and capture the flag, which were not encountered during training. We find the agent exhibits general, heuristic behaviors such as experimentation, behaviors that are widely applicable to many tasks rather than specialized to an individual task. This new approach marks an important step toward creating more general agents with the flexibility to adapt rapidly within constantly changing environments.
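
The key idea, a training distribution that tracks the agent's ability so it never stalls, can be caricatured in a few lines. This is an invented toy (scalar "skill", made-up sampling scheme), not DeepMind's actual task-generation algorithm:

```python
import random

def open_ended_training(steps=200, seed=0):
    """Cartoon of dynamic task generation: keep sampling tasks near the
    frontier of the agent's ability, so there is always something new
    to learn."""
    rng = random.Random(seed)
    skill = 0.0  # scalar stand-in for the agent's capabilities
    for _ in range(steps):
        # Propose candidate tasks of random difficulty...
        candidates = [rng.uniform(0, 10) for _ in range(8)]
        # ...and train on one that is challenging but still within reach.
        frontier = [d for d in candidates if skill <= d <= skill + 1.0]
        if frontier:
            skill = min(frontier)  # solving a frontier task raises skill
    return skill
```

Because the curriculum shifts with the agent, easy tasks stop being sampled once they stop teaching anything, which is the "learning process continually refining the training tasks" described above.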

Restoring The Night Watch

The missing edges of Rembrandt’s painting The Night Watch have been restored using artificial intelligence. The canvas, created in 1642, was trimmed in 1715 to fit between 2 doors at Amsterdam’s city hall.

Rembrandt actually used 4 different colors to paint a minuscule light effect in the eye of one of the many life-sized protagonists featured in this group portrait, which probably wouldn’t be seen by anybody anyway.

AI Labor Supply

61.6% of the working-age population were active in the labor force, either working in jobs or looking for them. That is essentially unchanged from the summer of 2020. Wages are soaring. In May, average wages grew at a 6.1% annual rate. In April, they grew at an 8.7% annual rate. Employers are boosting wage offers in order to attract and retain workers, who are increasingly difficult to attract and retain.
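
Assuming those figures are month-over-month wage changes stated as compounded annual rates, the conversion is a one-liner:

```python
def annualized(monthly_change):
    """Compound a one-month growth rate into an annual rate."""
    return (1 + monthly_change) ** 12 - 1

# A ~0.49% month-over-month wage gain compounds to roughly the 6.1%
# annual rate cited for May.
print(f"{annualized(0.00494):.1%}")
```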

Unable to find enough workers, Lee’s Famous Recipe Chicken installed an automated voice system to take orders. The system never fails to upsell customers on fries or a drink, which has boosted sales. There’s no longer a need for a person to take orders at the drive-thru window. “It also never calls in sick. There’s no way we’re going back.”

A massive shift to delivery and virtual kitchens triggered by the pandemic may mean that some restaurants and some customers will be more willing to use technology that once seemed unfamiliar. Using an app to order at a restaurant table could mean that, eventually, fewer servers will be needed.

UAV L1 Autonomy Safety

EASA has a roadmap for autonomous flight with 3 levels of autonomy.

In collaboration with my friends at Daedalean, they just released their approach to certifying the safety of the whole L1 system, a first for an ML system in aviation, as far as I know. This ought to help the nascent UAV market overcome regulatory barriers. You can get a sense of the state of the art from this autonomous test flight of the EHang 216 drone with the CEO on board.

Moore’s Law for Everything

What a permanent stimulus could look like, to ease us into UBI:

We should focus on taxing capital rather than labor, and we should use these taxes as an opportunity to directly distribute ownership and wealth to citizens. The best way to improve capitalism is to enable everyone to benefit from it directly as an equity owner. The 2 dominant sources of wealth will be 1) companies, particularly ones that make use of AI, and 2) land, which has a fixed supply.

The amount of wealth available to capitalize the American Equity Fund would be significant. There is about $50t worth of value in US companies alone. This will at least double over the next 10 years.

There is also $30t of privately-held land in the US. Assume that this value will double, too, over the next 10 years. If we increase the tax burden on holding land, its value will diminish relative to other investment assets, which is a good thing for society because it makes a fundamental resource more accessible and encourages investment instead of speculation. The value of companies will diminish in the short-term, too, though they will continue to perform quite well over time.

Under the above set of assumptions (current values, future growth, and the reduction in value from the new tax), 10 years from now each of the 250m adults in America would get about $13k every year. That dividend could be much higher if AI accelerates growth, but even if it doesn’t, $13k will have much greater purchasing power than it does now, because technology will have greatly reduced the cost of goods and services. And that effective purchasing power will go up dramatically every year.
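
The $13k figure can be roughly reproduced. The asset values and growth are the ones given above; the tax rate is an assumption here, since the excerpt doesn't state one:

```python
companies_now = 50e12  # ~$50t of value in US companies today
land_now      = 30e12  # ~$30t of privately held US land
growth        = 2.0    # both pools assumed to double over 10 years
tax_rate      = 0.025  # ASSUMED annual levy paid into the equity fund
adults        = 250e6  # adults in America

pool = (companies_now + land_now) * growth  # ~$160t a decade from now
dividend = pool * tax_rate / adults
print(f"${dividend:,.0f} per adult per year")  # ≈ $16,000 before the haircut
```

The essay's lower ~$13k figure reflects the third assumption listed above: the new tax itself would reduce the value of the assets being taxed.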

10x H.264

We propose a neural talking-head video synthesis model and demonstrate its application to video conferencing. Our model learns to synthesize a talking-head video using a source image containing the target person’s appearance and a driving video that dictates the motion in the output. Motion is encoded with a novel keypoint representation, in which identity-specific and motion-related information are decomposed without supervision. Extensive experimental validation shows that our model outperforms competing methods on benchmark datasets. Moreover, our compact keypoint representation enables a video conferencing system that achieves the same visual quality as the commercial H.264 standard while using only one-tenth of the bandwidth. We also show that our keypoint representation allows the user to rotate the head during synthesis, which is useful for simulating a face-to-face video conferencing experience. Our adaptive scheme and our 20-keypoint scheme obtain 10.37× and 6.5× bandwidth reductions relative to the H.264 codec, respectively.
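
A back-of-envelope calculation shows why sending keypoints instead of pixels saves roughly an order of magnitude. Every number below is an illustrative assumption, not a measurement from the paper:

```python
fps = 30
h264_kbps = 600      # assumed bitrate for a modest talking-head H.264 stream

keypoints = 20       # keypoints per frame, as in the paper's 20-keypoint scheme
floats_per_kp = 6    # assumed payload: 2D position + 4 local Jacobian entries
bytes_per_float = 2  # assumed 16-bit quantization

kp_kbps = keypoints * floats_per_kp * bytes_per_float * 8 * fps / 1000
print(f"keypoints: {kp_kbps:.1f} kbps vs H.264: {h264_kbps} kbps "
      f"(~{h264_kbps / kp_kbps:.0f}x less)")
```

Under these assumptions the keypoint stream costs about 58 kbps against 600 kbps for pixels, i.e. roughly the tenfold saving the title refers to; the receiver then resynthesizes the frames from the source image plus the keypoints.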

Multimodal Neurons

Using the tools of interpretability, we give an unprecedented look into the rich visual concepts that exist within the weights of CLIP. Within CLIP, we discover high-level concepts that span a large subset of the human visual lexicon—geographical regions, facial expressions, religious iconography, famous people and more. By probing what each neuron affects downstream, we can get a glimpse into how CLIP performs its classification.
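
The probing step can be illustrated in miniature: rank stimuli by how strongly they activate one chosen unit. The "model" below is a toy stand-in, not CLIP:

```python
def top_activating(stimuli, activation_fn, unit, k=3):
    """Return the k stimuli that most strongly activate `unit`."""
    scored = [(activation_fn(s)[unit], s) for s in stimuli]
    scored.sort(reverse=True)
    return [s for _, s in scored[:k]]

# Toy "model": activations are just keyword counts per hypothetical unit.
def toy_activations(stimulus):
    return {"face": stimulus.count("smile"), "region": stimulus.count("map")}

stimuli = ["smile smile photo", "map of europe", "smile", "blank"]
print(top_activating(stimuli, toy_activations, unit="face", k=2))
# → ['smile smile photo', 'smile']
```

In the real setting the stimuli are images (or rendered text), the activations come from CLIP's hidden units, and the top-activating examples are what reveal a unit's concept, such as a region or an emotion.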