This directory contains WikiNet. It contains the following files:

____
index.wiki - a direct index linking linguistic expressions of concepts (in various languages, as found in Wikipedia) with unique IDs. The structure of each line in the file is:

    "name" ID

This file contains no language information (i.e., it does not record which language an expression corresponds to).

____
data.wiki - a relations file, containing all relations surrounding each concept. The structure of each line is:

    ID Relation1 ID11 ID12 ... ID1n Relation2 ID21 ID22 ... ...

Relation names that start with a - indicate the inverse relation. For example,

    ID1 -CATEGORY ID2 ID3

indicates that ID1 is the category of (articles) ID2 and ID3. For both ID2 and ID3 there is the corresponding relation: ID2 CATEGORY ID1, ID3 CATEGORY ID1. All relations for a concept are on one line.

____
reversed_index.wiki - a reversed index file, which associates an ID with all its known lexicalizations found in Wikipedia. The structure of a line is:

    ID NE:(0|1) language1ID:name1 language2ID:name2 ...

As opposed to the index file, we keep language information in the reversed index file to be able to generate readable output for user queries about the relations surrounding a word. This file contains only the name associated with the corresponding article/category for the languages found. NE is the named entity information (0: not a named entity, 1: named entity).

____
reversed_index.all.wiki - similar to reversed_index.wiki, but all alternative names for a concept are associated with the concept ID. In reversed_index.wiki we have one name per language.

____
cooccurrence.wiki - a list of (sentence) cooccurrences of concepts. The structure of each line is:

    ID1 ID2 ID3 ...

The concept with ID1 cooccurs in some sentence with ID2, ID3, ...

____
defs.wiki - the definitions for the existing concepts. Each definition is the first line of the page that corresponds to the concept. Not all concepts have a definition; those without one mostly correspond to categories.

____
generality.wiki - the generality of a concept, measured as the distance from the top of Wikipedia's category hierarchy

____
hypoCounts.wiki - the number of hyponyms (as subsumed articles) for each concept

____
inLinks.wiki - for each concept, the list of concepts that link to it through their corresponding pages

____
outLinks.wiki - for each concept, the list of concepts that it links to through its corresponding page

________________________
stats - contains some statistics related to the data:
    number of relations for each relation type in the data (in the data.wiki file)
    number of entries per language

We now also have a toolkit (Java-based) that can help with embedding WikiNet in applications. You can download it here: http://sourceforge.net/projects/wikinettk/
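
For readers who want to read the files directly rather than through the toolkit, the line formats of index.wiki and data.wiki described above could be parsed roughly as in the Python sketch below. This is not part of WikiNet or its toolkit. The function names are hypothetical, and the sketch assumes that fields are separated by whitespace and that IDs are unsigned decimal integers, so that any non-numeric token in a data.wiki line is a relation name. Neither assumption is confirmed by this README; check them against the actual files.

```python
def parse_index_line(line):
    """Parse one index.wiki line of the form: "name" ID

    Assumption: the ID is the last whitespace-separated token on the line.
    """
    name_part, id_part = line.rstrip().rsplit(None, 1)
    # Drop one pair of surrounding double quotes, if present.
    if len(name_part) >= 2 and name_part[0] == '"' and name_part[-1] == '"':
        name_part = name_part[1:-1]
    return name_part, int(id_part)


def parse_relations_line(line):
    """Parse one data.wiki line into (concept_id, {relation: [target_ids]}).

    Assumption: IDs are unsigned decimal integers, so every non-numeric
    token starts a new relation.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty line")
    concept_id = int(tokens[0])
    relations = {}
    current = None
    for token in tokens[1:]:
        if token.isdecimal():
            if current is None:
                raise ValueError("target ID found before any relation name")
            relations[current].append(int(token))
        else:
            current = token
            relations.setdefault(current, [])
    return concept_id, relations


def is_inverse(relation):
    """Relation names starting with '-' denote the inverse relation."""
    return relation.startswith("-")


if __name__ == "__main__":
    # Made-up sample line; the IDs are illustrative, not taken from the data.
    concept_id, relations = parse_relations_line("12 -CATEGORY 34 56")
    for relation, targets in relations.items():
        direction = "inverse" if is_inverse(relation) else "direct"
        print(concept_id, relation, direction, targets)
```

The reversed index files are left out of the sketch on purpose: names can contain spaces, so splitting a reversed_index.wiki line correctly depends on the actual field delimiter, which this README does not state.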