Tag: languages

Ruby Grammar Visualization

As part of the momentum surrounding the Ruby implementer’s summit, I have decided to take on a pet project to understand Ruby’s grammar better, with the goal of contributing to an implementation-independent specification of the grammar. Matz mentioned during his keynote how parse.y was one of the uglier parts of Ruby, but just how ugly?

With comparisons to Java and JavaScript. Fascinating, even though it is not apparent what it means.

Neanderthals

Pääbo may have the entire Neanderthal genome sequenced in the next 18 months.

2009-02-13: Draft Genome is announced. Bookmarked also for the nice facial reconstruction.

2009-05-17: We ate them

Neanderthals met a violent end at our hands and in some cases we ate them

2010-09-28: The cloning arguments are nothing new, but I was struck by

There were no cities when the Neanderthals went extinct, and at their population’s peak there may have only been 10k of them spread across Europe. A cloned Neanderthal might be missing the genetic adaptations we have evolved to cope with the world’s greater population density, whatever those adaptations might be. But, not everyone agrees that Neanderthals were so different from modern humans that they would automatically be shunned as outcasts.

2013-08-16: Neanderthal leather-working

Excavations of Neanderthal sites 40 ka BP have uncovered a kind of tool that leather workers still use to make hides more lustrous and water resistant. The bone tools, known as lissoirs, had previously been associated only with modern humans. The latest finds indicate that Neanderthals and modern humans might have invented the tools independently.

2016-05-25: 176 ka ago is unimaginably old. This is more than 15x older than Gobekli Tepe.

After drilling into the stalagmites and pulling out cylinders of rock, the team could see an obvious transition between 2 layers. On one side were old minerals that were part of the original stalagmites; on the other were newer layers that had been laid down after the fragments were broken off by the cave’s former users. By measuring uranium levels on either side of the divide, the team could accurately tell when each stalagmite had been snapped off for construction.

Their date? 176 ka ago, give or take a few millennia. “When I announced the age to Jacques, he asked me to repeat it because it was so incredible”. Outside Bruniquel Cave, the earliest, unambiguous human constructions are just 20 ka old. Most of these are ruins—collapsed collections of mammoth bones and deer antlers. By comparison, the Bruniquel stalagmite rings are well-preserved and far more ancient.

2016-05-27: More Neanderthal than human

In some spots of our genome, we are more Neanderthal than human. The sequences we inherited from archaic hominins helped us survive and reproduce.

2017-01-15: Neanderthals Were People, Too

For millenniums, some scientists believe, before modern humans poured in from Africa, the climate in Europe was exceptionally unstable. The landscape kept flipping between temperate forest and cold, treeless steppe. The fauna that Neanderthals subsisted on kept migrating away, faster than they could. Though Neanderthals survived this turbulence, they were never able to build up their numbers. (Across all of Eurasia, at any point in history, “there probably weren’t enough of them to fill a stadium.”) With the demographics so skewed, even the slightest modern human advantage would be amplified tremendously: a single innovation, something like sewing needles, might protect just enough babies from the elements to lower the infant mortality rate and allow modern humans to conclusively overtake the Neanderthals. And yet Stringer is careful not to conflate innovation with superior intelligence. Innovation, too, can be a function of population size. “We live in an age where information, where good ideas, spread like wildfire, and we build on them. But it wasn’t like that 50 ka ago.” The more members your species has, the more likely 1 member will stumble on a useful new technology — and that, once stumbled upon, the innovation will spread; you need sufficient human tinder for those sparks of culture to catch.

2017-09-05: 200 ka Neanderthal Glue

As far back as 200 ka ago Neanderthals were using a tar-based adhesive to glue axe heads and spears to their handles. Researchers have attempted to recreate the Neander-glue, which could help scientists figure out just how technologically sophisticated the species was. Archaeologists have found lumps of adhesive tar likely made from birch bark at Neanderthal sites in Italy and Germany. But just how they made the substance puzzled researchers, especially because they did it without the aid of ceramic pots, which were used by later cultures to produce large quantities of tar.

2019-06-12: Did Neanderthals Speak?

Neanderthals had the anatomical properties to create the sounds that could form the basis of speech, though any words they produced would have sounded a bit unfamiliar to modern human ears

2020-03-04: long distance Neanderthals

Their intercontinental odyssey over 1000s of kilometers is a rarely observed case of long-distance dispersal in the Paleolithic and highlights the value of stone tools as culturally informative markers of ancient population movements.

2022-11-19: Interbreeding in Africa

The human-like Y chromosome entered the Neanderthal gene pool well before the migration out of Africa 80ka BP – perhaps 270ka BP. Which means that many of the Neanderthals that those migrants encountered must have already had human-like Y chromosomes! The Neanderthal Y chromosome and mitochondrial DNA are 2 new lines of evidence that point to a much more complex and ancient relationship between us and our closest cousins than we otherwise would have known.

2023-12-15: A new book, The Naked Neanderthal, looks interesting

Next, he explores evidence from skeletal remains for butchery and cannibalism of the dead in Neanderthal communities at Moula Guercy. Some researchers have proposed that such findings are a sign of starvation — evidence that Neanderthals were not able to adapt to the warm Eemian forests. Slimak concludes instead that these behaviors were a natural part of hominin social interactions, citing growing evidence from both archaeology and primatology that such practices were relatively common among humans right through prehistory.
Humans temporarily replaced local Neanderthals 54ka BP over an extraordinarily short time — potentially less than 1 year. The author uses this to argue that extermination, rather than assimilation, is the most likely explanation for the Neanderthals’ eventual extinction.

Parrot has 950 word vocabulary

The bird has a vocabulary of 950 words, and shows signs of a sense of humor.

Sophonts

Do some elephants, at some age, develop the ability to think far into the future and pass the wisdom to their young? That is, is the incidence of “culture” among elephants the result of intellectual prognostication? No. If you eliminated all adult elephants, would the current “civilized” state of elephant culture eventually re-emerge after a number of generations? If so, after how many generations? Yes, with caveats.

Joshua riffs on the possible origins of the elephant society. Go read this article on elephant violence. It has the qualities of a seminal piece on cross-species relations. Consider this statement from a Ugandan researcher who grew up in a war zone:

I started looking again at what has happened among the Acholi and the elephants. I saw that it is an absolute coincidence between the 2. All these kids who have grown up with their parents killed – no fathers, no mothers, only children looking after them. They form these roaming, violent, destructive bands. It’s the same thing that happens with the elephants. Just like the male war orphans, they are wild, completely lost.
Most people are scared of showing that kind of anthropomorphism. But coming from me it doesn’t sound like I’m inventing something. It’s there. People know it’s there. Some might think that the way I describe the elephant attacks makes the animals look like people. But people are animals.

Now we can either discuss the semantics of sentience as we recognize our peer species, hopefully before it is too late, or we can adopt a new term that is not laden with meaning that needs to be repurposed first. Sophonts works for me: Why look at far away stars when we can find peers right under our nose?

2006-10-30: another hurdle cleared

Elephants can recognize themselves in a mirror, joining only humans, apes and dolphins as animals that possess this kind of self-awareness

2007-12-30: More sophonts, unsurprisingly.

As recently as 10 years ago, the conventional wisdom doubted that even chimpanzees, which are more closely related to human beings than are monkeys, possessed theory of mind. This view is changing

2013-12-30: All sophont teenagers are the same

Dolphins ‘deliberately get high’ on puffer fish nerve toxins by carefully chewing and passing them around

2014-02-03: How is your self-definition of your human identity going?

This study describes how 3 individual fish developed a novel behavior and learnt to use a dorsally attached external tag to activate a self-feeder. This behavior was repeated up to several 100x, and over time these fish fine-tuned the behavior and made a series of goal-directed coordinated movements needed to attach the feeder’s pull string to the tag and stretch the string until the feeder was activated. These observations demonstrate a capacity in cod to develop a novel behavior utilizing an attached tag as a tool to achieve a goal. This may be seen as one of the very few observed examples of innovation and tool use in fish.

2014-10-09: a preview of the legal climate as we uplift various sophonts.

A New York appeals court will consider this week whether chimpanzees are entitled to “legal personhood” in the first case of its kind.

2015-07-02: Meaning in bird song

A study of the chestnut-crowned babbler bird from Australia revealed a method of communicating that has never before been observed in animals. The bird combines sounds in different combinations to convey meaning. “It is the first evidence outside of a human that an animal can use the same meaningless sounds in different arrangements to generate new meaning. It’s a very basic form of word generation – I’d be amazed if other animals can’t do this too.” Babbler birds were found to combine 2 sounds (known as A and B) to generate calls associated with specific behaviors. In flight, they used an “A-B” call to make their whereabouts known, but when alerting chicks to food they combined the sounds differently to make “B-A-B”. The birds seemed to understand the meaning of the calls. When the feeding call was played back to them, they looked at nests, while when they heard a flight call they looked at the sky

2016-03-21: Ants recognize themselves

Our observations suggest that some ants can recognize themselves when confronted with their reflection view, this potential ability not necessarily implicating some self awareness.

2022-02-17: Elks understand property?

Elk in Utah are smart enough to move off of public lands (where they can be hunted) and on to private lands where they cannot. And then, when hunting season is over, they shift right back to public lands. Elks’ use of public land diminished by 30% by the middle of rifle season. “It’s crazy; on the opening day of the hunt, they move, and on the closing day they move back. It’s almost like they’re thinking, ‘Oh, all these trucks are coming, it’s opening day, better move.’ They understand death. They get it; they’ve figured it out.”

2022-02-23: Chimpanzees treating their own wounds

Never before have scientists observed chimpanzees (or any animal) essentially “treating” a wound or applying a different animal species to a wound. It’s likely an example of allo-medication behavior (medicating others) in apes, which has never been seen before. The chimpanzees caught an insect from the air, which they immobilized by squeezing it between their lips. Then they placed it on an exposed surface of the wound and moved it around using their fingertips or lips. Finally, they extracted the insect from the wound.

2023-09-29: Crow statistical reasoning

2 crows had to choose between 2 images, each corresponding to a different reward probability. Crows were tasked with learning rather abstract quantities (i.e., not whole numbers), associating them with abstract symbols, and then applying that combination of information in a reward maximizing way. Over 10 days of training and 5k trials, the 2 crows continued to pick the higher probability of reward, showing their ability to use statistical inference.

Regaining speech via Rhyme

Where Scott Adams hacks his brain

Just because no one has ever gotten better from Spasmodic Dysphonia before doesn’t mean I can’t be the first. So every day for months and months I tried new tricks to regain my voice. I visualized speaking correctly and repeatedly told myself I could (affirmations). I used self hypnosis. I used voice therapy exercises. I spoke in higher pitches, or changing pitches. I observed when my voice worked best and when it was worst and looked for patterns. I tried speaking in foreign accents. I tried “singing” some words that were especially hard.

The day before yesterday, while helping on a homework assignment, I noticed I could speak perfectly in rhyme. Rhyme was a context I hadn’t considered. A poem isn’t singing and it isn’t regular talking. But for some reason the context is just different enough from normal speech that my brain handled it fine.

Jack be nimble, Jack be quick.
Jack jumped over the candlestick.

I repeated it 10s of times, partly because I could. It was effortless, even though it was similar to regular speech. I enjoyed repeating it, hearing the sound of my own voice working almost flawlessly. I longed for that sound, and the memory of normal speech. Perhaps the rhyme took me back to my own childhood too. Or maybe it’s just plain catchy. I enjoyed repeating it more than I should have. Then something happened.

My brain remapped.

My speech returned.

Not 100%, but close, like a car starting up on a cold winter night. And so I talked that night. A lot. And all the next day. A few times I felt my voice slipping away, so I repeated the nursery rhyme and tuned it back in. By the following night my voice was almost completely normal.

When I say my brain remapped, that’s the best description I have. During the worst of my voice problems, I would know in advance that I couldn’t get a word out. It was as if I could feel the lack of connection between my brain and my vocal cords. But suddenly, yesterday, I felt the connection again. It wasn’t just being able to speak, it was KNOWING how. The knowing returned.

Translation

hmm

GT now gets 55% accuracy on English to Arabic. Human agreement on human translations is 60%. After this point they have no standard by which to measure their progress

2016-09-27: Getting amazingly close to human level performance. It’s interesting that for all languages, the gap between human and perfect translation is much, much larger than between human and machine.

Neural Machine Translation: Much better translation quality
Full technical report (23 exciting pages of bedtime reading)

Research blog post

I’m very excited to announce that our new neural machine translation system closes the quality gap between the existing Google Translate production system and human quality translations by 58% to 87% for a variety of different language pairs (see table below, from the technical report we published today). This work has been a close collaboration between the Google Brain team and the Google Translate team.

Thanks to lots of hard engineering work and the computational efficiency of our Tensor Processing Units (see report), we are also rolling these benefits out to users of Google Translate, starting today with Mandarin to English as the first language pair live in production that uses this new system. We’ll be rolling out many more language pairs over the coming weeks.

This highlights the success of neural models at more accurately capturing the complexities of real human language, and is a powerful demonstration of the research our group has been doing on language understanding.
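A figure like "closes the quality gap by 58% to 87%" is easiest to read as a relative reduction: how far the new system moves from the old system toward the human score. A minimal sketch of that arithmetic, assuming all 3 systems are rated on the same scale; the function name and the example ratings are made up for illustration and are not figures from the report:

```python
def gap_closed(old_score: float, new_score: float, human_score: float) -> float:
    """Fraction of the gap between the old system and humans that the new system closes."""
    gap = human_score - old_score
    if gap <= 0:
        raise ValueError("human_score must be higher than old_score")
    return (new_score - old_score) / gap


# Hypothetical ratings on a shared quality scale (not taken from the report).
print(f"{gap_closed(old_score=4.0, new_score=4.5, human_score=5.0):.0%}")
```

Under this reading, a 87% gap closure still leaves the new system below the human score; it says nothing about how far human translations are from a perfect one.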

2016-11-15: Nice behind the scenes article on the recent translation breakthrough.

With this update, Google Translate is improving more in a single leap than we’ve seen in the last 10 years combined.

3 overlapping stories converge in Google Translate’s successful metamorphosis to A.I. — a technical story, an institutional story and a story about the evolution of ideas. The technical story is about 1 team on 1 product at 1 company, and the process by which they refined, tested and introduced a brand-new version of an old product in only about a quarter of the time anyone, themselves included, might reasonably have expected. The institutional story is about the employees of a small but influential artificial-intelligence group within that company, and the process by which their intuitive faith in some old, unproven and broadly unpalatable notions about computing upended every other company within a large radius. The story of ideas is about the cognitive scientists, psychologists and wayward engineers who long toiled in obscurity, and the process by which their ostensibly irrational convictions ultimately inspired a paradigm shift in our understanding not only of technology but also, in theory, of consciousness itself.

2023-07-08: Akkadian translation, with modest BLEU scores.

In its transliteration to English test, the AI model scored 37.47. In its cuneiform to English test, it scored 36.52. Both scores were above their target baseline and in the range of a high-quality translation. The model was able to reproduce the nuances of each test sentence’s genre. The AI model works best when it is translating short- to medium-length sentences. It also does better with more formulaic genres, like royal decrees and administrative records, than literary genres such as myths, hymns, and prophecies. With more training on a larger dataset, they aim to improve its accuracy. “100s of 1000s of clay tablets inscribed in the cuneiform script document the political, social, economic, and scientific history of ancient Mesopotamia. Yet, most of these documents remain untranslated and inaccessible due to their sheer number and limited quantity of experts able to read them”
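For a sense of what a BLEU score in the 30s measures: BLEU compares the n-grams of a machine translation against a reference translation and reports the overlap on a 0 to 100 scale. The sketch below is a simplified, single-sentence, single-reference version with whitespace tokenization and no smoothing. It is an illustration of the idea, not the evaluation setup the Akkadian researchers used, and the function names are my own.

```python
import math
from collections import Counter


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU on a 0-100 scale (one reference, no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Modified precision: clip each candidate n-gram by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any empty n-gram level zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: punish candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100 * brevity * math.exp(sum(log_precisions) / max_n)


# Made-up sentences, only to show the call.
print(bleu("the king sent a letter to the city", "the king sent a letter to the temple"))
```

Because the score is a geometric mean over 1- to 4-grams, a single wrong word in a short sentence costs a lot, which is one reason real evaluations compute BLEU over a whole test corpus rather than per sentence.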

Content readability

OpenCms: 60.83
Plone: 66.25
Lenya: 52.72
Midgard: 36.49
Documentum: 70.34
Day: no text, all images
Interwoven: “ill-formed tag”
(scores from the Flesch-Kincaid readability test)
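The numbers above look like they are on the 0 to 100 Flesch Reading Ease scale, where higher means easier to read (that is my assumption; the related Flesch-Kincaid grade level uses different constants). Reading Ease is 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word). A rough sketch, with a naive vowel-group syllable counter that real tools replace with something more careful:

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count runs of vowels, with at least 1 syllable per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


print(round(flesch_reading_ease("The cat sat. The dog ran away quickly."), 2))
```

A page with no extractable text has nothing to score, which is why the sketch raises an error instead of returning a number, much like the "no text, all images" entry in the list.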

Digital Unroll

Multi-spectral imaging technology is bringing a hoard of texts from antiquity back to life. I wonder if the hoard contains a copy of the second book of Aristotle’s Poetics, his missing treatise on comedy? Hopefully, it also contains ‘lesser works’ that would shed light on scenarios that were seriously considered by the relevant historical personalities, leading to possible alternative courses of history.

2013-12-19: Over 100 years ago, archaeologists discovered a 2 ka old trash dump near Oxyrhynchus in Egypt, chock full of 1000s of ancient documents, and preserved by the desert and pure chance. From Wikipedia on Oxyrhynchus:

Because Egyptian society under the Greeks and Romans was governed bureaucratically, and because Oxyrhynchus was the capital of the 19th nome, the material at the Oxyrhynchus dumps included vast amounts of paper. Accounts, tax returns, census material, invoices, receipts, correspondence on administrative, military, religious, economic, and political matters, certificates and licenses of all kinds—all these were periodically cleaned out of government offices, put in wicker baskets, and dumped out in the desert. Private citizens added their own piles of unwanted paper. Because papyrus was expensive, paper was often reused: a document might have farm accounts on one side, and a student’s text of Homer on the other. The Oxyrhynchus Papyri, therefore, contained a complete record of the life of the town, and of the civilizations and empires of which the town was a part.

In the century since they were uncovered, only a small fraction of the 1000 briefcase-sized storage boxes of papyrus fragments have been edited and published. There are ongoing efforts to speed this up using multispectral imaging, high resolution CT scanning, and transcription by crowdsourcing.

2013-12-23: Using CT imaging at the micron instead of a millimeter scale to virtually unroll a scroll and bring the libraries of Herculaneum back to life.

However, unraveling was still a problem so scientists kept searching for a mechanism by which to examine the scrolls while they remained closed.

A computer science professor from the University of Kentucky thought he had the answer. Working with 2 preserved Herculaneum scrolls, Brent Seales used micro-CT imaging techniques to attempt to “virtually unroll a scroll.” Micro-CT works at a higher resolution than regular CT scans, operating on the much-smaller micron scale instead of a millimeter scale. Experiments on similar objects seemed promising.

2015-11-17: X-ray phase-contrast tomography

Hundreds of papyrus rolls, buried by the eruption of Mount Vesuvius in 79 AD and belonging to the only library passed on from Antiquity, were discovered 260 years ago at Herculaneum. These carbonized papyri are extremely fragile and are inevitably damaged or destroyed in the process of trying to open them to read their contents. In recent years, new imaging techniques have been developed to read the texts without unwrapping the rolls. Until now, specialists have been unable to view the carbon-based ink of these papyri, even when they could penetrate the different layers of their spiral structure. Here for the first time, we show that X-ray phase-contrast tomography can reveal various letters hidden inside the precious papyri without unrolling them.

2022-03-09: Now combine this with ML to make sense of text fragments.

Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow. The architecture of Ithaca focuses on collaboration, decision support and interpretability. While Ithaca alone achieves 62% accuracy when restoring damaged texts, the use of Ithaca by historians improved their accuracy from 25% to 72%, confirming the synergistic effect of this research tool. Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history. This research shows how models such as Ithaca can unlock the cooperative potential between artificial intelligence and historians, transformationally impacting the way that we study and write about one of the most important periods in human history.

2023-04-04: What we might find at Herculaneum

There would have been a great deal else. Literature, history, science. Epistolaries, miscellanies, essays. Memoirs, novels, biographies. Satires. The work of orators and poets. Philosophy and mathematics. Scientific studies and technical manuals. Dictionaries and encyclopedias; and more. For example, a prominent Latin collector near to Rome is likely to have had the epistolaries (published letter collections) of Cicero. While we already have copies of those, finding editions scribed within decades of his death would still be of considerable use. More importantly, medieval Christians chose not to preserve almost all ancient literature; so there could be epistolaries from other authors here, famous and obscure. And even poets and orators and novelists, besides being priceless to recover just in respect to the history of art, would also have commented on various subjects of importance, such as popular religion and events.

State of NLP

The problems of NLP are highly interconnected and need to be solved as a whole, yet very few efforts have been made to attack several frontiers at once. Most research is very academic and niche-oriented. The assertion is that CPU power and large amounts of data, such as Wikipedia, will create breakthroughs in NLP.

Cultural transposition

Andrew Wilson has translated Harry Potter into Ancient Greek, the longest work to receive that treatment since 400 CE. Wilson was intent on creating a version of the book which would make sense to a Greek from any era up to the 4th century AD who had managed by some magical process to reach the 21st century.
This led to some thorny problems:

Cultural problems: There were many, one of the more obvious being relationships – the patriarchal Greeks not really concerning themselves with relationships like mother’s sister.
Time was another one – Greeks had little interest in “telling the time” although they did have devices for measuring how much had elapsed (water clocks for timing speeches, for example). Nor did they care about minutes, let alone seconds!
And colors – it’s little appreciated how languages divide up the visible spectrum of light in their own way – our red orange yellow etc is of course completely arbitrary – the spectrum is a continuum.

Reminds me of Douglas Hofstadter’s Le Ton Beau de Marot.

A skilled literary translator makes a far larger number of changes, and far more significant changes, than any virtuoso performer of classical music would ever dare to make in playing notes in the score of, say, a Beethoven piano sonata. In literary translation, it’s totally humdrum stuff for new ideas to be interpreted, old ideas to be deleted, structures to be inverted, twisted around, and on and on.