Wikipedia disambiguation

Wikipedia contains much more than unstructured text. In addition to pages describing individual entities, from which contextual clues can be extracted (example), it contains redirects for different surface forms of the same entity, list pages that categorize names, and disambiguation pages that show many of the different entities a surface form can refer to. Exploiting this semi-structured data (the redirect, list, and disambiguation pages) gives this work its power.
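To make the idea concrete, here is a minimal sketch of how redirects and disambiguation pages could be folded into a lookup from surface forms to candidate entities. It is not the method from the work described above. The function name `build_surface_form_index`, the input shapes, and the toy "Columbia" data are all made up for illustration, and real Wikipedia dumps would need parsing first.

```python
from collections import defaultdict


def build_surface_form_index(titles, redirects, disambiguations):
    """Map each lowercased surface form to the set of entities it may denote.

    All three inputs are hypothetical, pre-parsed structures:
      titles          -- iterable of entity page titles
      redirects       -- dict of alias -> target entity title
      disambiguations -- dict of surface form -> list of entity titles
    """
    index = defaultdict(set)

    # An entity's own title is a surface form for that entity.
    for title in titles:
        index[title.lower()].add(title)

    # A redirect is an alternate surface form for exactly one entity.
    for alias, target in redirects.items():
        index[alias.lower()].add(target)

    # A disambiguation page lists several entities for one surface form.
    for surface, candidates in disambiguations.items():
        index[surface.lower()].update(candidates)

    return index


# Toy data, invented for this example only.
titles = ["Columbia University", "Space Shuttle Columbia", "Columbia Pictures"]
redirects = {"Columbia Univ.": "Columbia University"}
disambiguations = {
    "Columbia": [
        "Columbia University",
        "Space Shuttle Columbia",
        "Columbia Pictures",
    ]
}

index = build_surface_form_index(titles, redirects, disambiguations)
print(sorted(index["columbia"]))
print(sorted(index["columbia univ."]))
```

An ambiguous form such as "Columbia" ends up with several candidates, and picking among them is where the contextual clues from the entity pages would come in. A redirect such as "Columbia Univ." resolves to a single entity directly.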

Wikipedia for entity extraction. Awesome what crowdsourcing can do.
