Tag: semweb

Potluck

the next marvel from the simile project. makes data munging almost bearable. uses google maps for good measure, of course.

Supervised labeling

A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, a minimum probability of error annotation and retrieval are feasible with algorithms that are 1) conceptually simple, 2) computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density estimated for each image, and the mixtures associated with all images annotated with a common semantic label pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning.

this system can produce tags on par with humans for many types of images.
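
a rough sketch of the decision rule the abstract describes, written as a standard minimum-probability-of-error (Bayes) classifier. the symbols are my assumptions, not the paper's notation: \mathcal{X} = \{x_1, ..., x_n\} is the bag of feature vectors from one image, w is a semantic label, P_{X|W} is the class density pooled from the per-image mixtures, and the feature vectors are assumed independent given the label.

```latex
w^*(\mathcal{X})
  = \arg\max_{w} P_{W|X}(w \mid \mathcal{X})
  = \arg\max_{w} \left[ \log P_W(w) + \sum_{k=1}^{n} \log P_{X|W}(x_k \mid w) \right]
```

annotation keeps the top-scoring labels for an image; retrieval ranks the images by the same score for a given label.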

Structured News

makes the point that news organizations could add a lot of value by marking news up properly with location, main actors, etc. they would do it for the SEO benefit, but it would also be a huge favor to historians and to knowledge representation work.

Mindmap standard?

Eric Blue has A Call To Action: The Need For A Common Mind Map File Format. He provides some very good reasons for such a format, in particular the ability to share mindmaps on the web. Before going any further, I should say I take sharing things on the web as meaning with deep, granular access to the data/content, not just placing an opaque blob file on a server. Before getting to the format, there’s an underlying problem that may be harder. There isn’t as yet a sharable model of what a mindmap is.

maybe one day. mindmaps have been a long time coming but are still nowhere.

Freebase enjoyment

Early indications are that Freebase is going to be a whole lot of fun. In his walkthrough Tim O’Reilly calls it addictive, and explains why. Because the system thinks in terms of relationships among types of items, a single act of data entry can produce multiple outcomes. Tim’s writeup gives a couple of examples of what that’s like. Here’s mine. I found a record for myself in the system, sourced from Wikipedia. I updated it to say that I’m the author of the book Practical Internet Groupware. Then I added that Tim O’Reilly was the editor of my book. That single edit altered the records on both ends of the author/editor relationship. My book’s record now showed Tim O’Reilly as its editor, and Tim’s record sprouted a Books Edited list that contained my book as its first item. Nice. This is just a Hello World example, of course, but it has the feel of something that people will be able to understand, will want to use, and will enjoy in a social way.

wow, if we can trick people into enjoying metadata creation, the sky is the limit. but beware of metacrap.
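
a toy sketch of the "one edit, two records" effect from the quote above. this is not freebase's schema or api: the `Graph` class, the `INVERSE` table and the property names are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical table of inverse properties (not Freebase's real schema).
INVERSE = {"editor": "books_edited", "author": "works_written"}


class Graph:
    """Tiny in-memory record store where links are kept on both ends."""

    def __init__(self):
        # record name -> property name -> list of linked record names
        self.records = defaultdict(lambda: defaultdict(list))

    def link(self, subject, prop, obj):
        """Record subject.prop = obj and mirror it on obj's record."""
        self.records[subject][prop].append(obj)
        inverse = INVERSE.get(prop)
        if inverse is not None:
            self.records[obj][inverse].append(subject)


graph = Graph()
# A single act of data entry...
graph.link("Practical Internet Groupware", "editor", "Tim O'Reilly")
# ...and both records now know about each other.
print(graph.records["Tim O'Reilly"]["books_edited"])
```

the point is only that the relationship is stored once per direction by the system, so nobody has to enter it twice.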

Advanced Tagging and TripleTags

geobloggers: hmm, that is starting to look very rdf-like. via bergie

Fragment Search

Greasemonkey script which allows people to create URLs which link to content within a page without having control over that page. this is great for more accurate tagging, and for adding structure where the original authors failed to.
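
a minimal sketch of the general idea: carry a search phrase in the URL fragment and resolve it against the page text on arrival. the `#find=` convention, `make_link` and `locate` are hypothetical names of mine; the actual script's fragment syntax is not reproduced here.

```python
from urllib.parse import quote, unquote, urldefrag

# Hypothetical fragment convention, NOT the Greasemonkey script's real syntax.
PREFIX = "find="


def make_link(page_url: str, phrase: str) -> str:
    """Build a link to a phrase inside a page we do not control."""
    base, _ = urldefrag(page_url)  # drop any fragment already present
    return f"{base}#{PREFIX}{quote(phrase)}"


def locate(link: str, page_text: str) -> int:
    """Return the character offset of the linked phrase, or -1 if absent."""
    fragment = urldefrag(link).fragment
    if not fragment.startswith(PREFIX):
        return -1
    return page_text.find(unquote(fragment[len(PREFIX):]))


link = make_link("http://example.org/post.html", "semantic web")
print(link)
print(locate(link, "notes on the semantic web today"))
```

a real user script would scroll to and highlight the match instead of returning an offset.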

Flickr API machine tags

more precise tags interspersed with ‘regular’ tags. formalizing the geo:lat etc stuff. smart!
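
a small sketch of pulling machine tags out of a mixed tag list, assuming the namespace:predicate=value shape. the regex is my approximation of that shape rather than flickr's own parser, and the tag values are invented.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Approximation: namespace and predicate start with a letter and contain
# only letters, digits or underscores; everything after "=" is the value.
_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.*)$")


def parse_tag(tag: str) -> Optional[MachineTag]:
    """Return a MachineTag for namespace:predicate=value, else None."""
    match = _PATTERN.match(tag)
    if match is None:
        return None
    return MachineTag(*match.groups())


# Made-up example: regular tags interspersed with machine tags.
tags = ["sunset", "geo:lat=57.64911", "geo:long=10.40744", "beach"]
machine = [t for t in map(parse_tag, tags) if t is not None]
plain = [t for t in tags if parse_tag(t) is None]
print(machine)
print(plain)
```

each parsed tag is effectively a (namespace, predicate, value) triple hanging off the photo, which is why this keeps looking rdf-like.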

Google Spreadsheets 2 KML

some people are trying really hard to bring the data web to life
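
for flavor, a minimal sketch of the conversion itself: rows with a name and coordinates in, KML placemarks out. it assumes the sheet was exported as CSV with `name`, `lat` and `lon` columns (my column names), and it is not the linked tool's code.

```python
import csv
import io
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def rows_to_kml(csv_text: str) -> str:
    """Turn CSV rows (name, lat, lon columns assumed) into a KML document."""
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for row in csv.DictReader(io.StringIO(csv_text)):
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = row["name"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML orders coordinates longitude first, then latitude.
        coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
        coords.text = f'{row["lon"]},{row["lat"]}'
    return ET.tostring(kml, encoding="unicode")


sample = "name,lat,lon\nZurich,47.3769,8.5417\n"
kml_text = rows_to_kml(sample)
print(kml_text)
```

the longitude-first ordering is the usual trap when going from a lat/lon spreadsheet to KML.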

15M foaf files

LJ is up to 15M accounts, and therefore that many FOAF files.