org.wikipedia.miner.model
Class Disambiguation

java.lang.Object
  extended by org.wikipedia.miner.model.Page
      extended by org.wikipedia.miner.model.Article
          extended by org.wikipedia.miner.model.Disambiguation
All Implemented Interfaces:
java.lang.Comparable<Page>

public class Disambiguation
extends Article

This class represents disambiguation pages in Wikipedia: the pages that list the various articles that an ambiguous term may refer to.

On top of the functionality provided by Article, it attempts to identify the linked articles which relate to alternative senses for the term in question. This is done through the following heuristics:

Author:
David Milne

Nested Class Summary
 
Nested classes/interfaces inherited from class org.wikipedia.miner.model.Article
Article.AnchorText
 
Field Summary
 
Fields inherited from class org.wikipedia.miner.model.Page
ARTICLE, CATEGORY, DISAMBIGUATION, REDIRECT
 
Constructor Summary
Disambiguation(WikipediaDatabase database, int id)
          Initializes a newly created Disambiguation so that it represents the disambiguation page given by id.
Disambiguation(WikipediaDatabase database, int id, java.lang.String title)
          Initializes a newly created Disambiguation so that it represents the disambiguation page given by id and title.
Disambiguation(WikipediaDatabase database, java.lang.String title)
          Initializes a newly created Disambiguation so that it represents the disambiguation page given by title.
 
Method Summary
 SensePage getMostObviousSense()
          Returns the most obvious or most common sense of the ambiguous term, by selecting the first article that the disambiguation page links to.
 SortedVector<SensePage> getSenses()
          Returns all senses of the ambiguous term, ordered by page id.
 java.util.Vector<SensePage> getSensesInPageOrder()
          Returns all senses of the ambiguous term, in the order they were found on the page.
static void main(java.lang.String[] args)
          Provides a demo of the functionality available to Disambiguations.
 
Methods inherited from class org.wikipedia.miner.model.Article
getAnchorTexts, getAvaliableLanguages, getEquivalentCategory, getLinksIn, getLinksInCount, getLinksInIds, getLinksOut, getLinksOutCount, getLinksOutIds, getParentCategories, getParentCategoryIds, getRedirects, getRelatednessTo, getTranslation, getTranslations
 
Methods inherited from class org.wikipedia.miner.model.Page
compareTo, createPage, equals, getContent, getFirstParagraph, getFirstSentence, getGenerality, getId, getScope, getTitle, getTitleWithoutScope, getType, getWeight, setWeight, toString
 
Methods inherited from class java.lang.Object
getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Disambiguation

public Disambiguation(WikipediaDatabase database,
                      int id,
                      java.lang.String title)
Initializes a newly created Disambiguation so that it represents the disambiguation page given by id and title. This is the most efficient constructor, because no database lookup is required.

Parameters:
database - an active WikipediaDatabase
id - the unique identifier of the disambiguation page
title - the (case-sensitive) title of the disambiguation page

Disambiguation

public Disambiguation(WikipediaDatabase database,
                      int id)
               throws java.sql.SQLException
Initializes a newly created Disambiguation so that it represents the disambiguation page given by id.

Parameters:
database - an active WikipediaDatabase
id - the unique identifier of the disambiguation page
Throws:
java.sql.SQLException - if no page is defined for the id, or if it is not a disambiguation page.

Disambiguation

public Disambiguation(WikipediaDatabase database,
                      java.lang.String title)
               throws java.sql.SQLException
Initializes a newly created Disambiguation so that it represents the disambiguation page given by title.

Parameters:
database - an active WikipediaDatabase
title - the (case-sensitive) title of the disambiguation page
Throws:
java.sql.SQLException - if no disambiguation page is defined for the title.
Method Detail

getMostObviousSense

public SensePage getMostObviousSense()
                              throws java.sql.SQLException
Returns the most obvious or most common sense of the ambiguous term, by selecting the first article that the disambiguation page links to.

Returns:
the most obvious (first) sense listed.
Throws:
java.sql.SQLException - if there is a problem with the Wikipedia database.
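
The selection rule described above can be sketched without the Wikipedia Miner classes. In the block below, Sense and mostObviousSense are hypothetical stand-ins (not part of this API), the ids and titles are made up, and returning null for a page with no senses is an assumption, since that case is not documented here.

```java
import java.util.Arrays;
import java.util.List;

class MostObviousSenseSketch {

    // Hypothetical stand-in for SensePage; not a Wikipedia Miner class.
    static class Sense {
        final int id;
        final String title;

        Sense(int id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Picks the first sense in page order.
    // Returning null for an empty list is an assumption, not documented behaviour.
    static Sense mostObviousSense(List<Sense> sensesInPageOrder) {
        return sensesInPageOrder.isEmpty() ? null : sensesInPageOrder.get(0);
    }

    public static void main(String[] args) {
        // Made-up ids and titles, listed in the order they would appear on the page.
        List<Sense> senses = Arrays.asList(
                new Sense(31, "Mercury (planet)"),
                new Sense(12, "Mercury (element)"),
                new Sense(47, "Mercury (mythology)"));

        Sense first = mostObviousSense(senses);
        System.out.println(first.title);
    }
}
```

Note that the choice depends only on position in the page, not on the page id, so the sense with the lowest id is not necessarily the one returned.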

getSenses

public SortedVector<SensePage> getSenses()
                                  throws java.sql.SQLException
Returns:
all senses of the ambiguous term, ordered by page id.
Throws:
java.sql.SQLException - if there is a problem with the Wikipedia database.

getSensesInPageOrder

public java.util.Vector<SensePage> getSensesInPageOrder()
                                                 throws java.sql.SQLException
Returns all senses of the ambiguous term, in the order they were found on the page. This order usually correlates with how obvious or well known each sense is.

Returns:
all senses of the ambiguous term, in the order they were found on the page.
Throws:
java.sql.SQLException - if there is a problem with the Wikipedia database.
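
To make the difference between the two orderings concrete, the following self-contained sketch derives an id-ordered list from a page-ordered one. It does not use SortedVector or SensePage: Sense, sortedById and titles are hypothetical names introduced for illustration, and the sample data is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SenseOrderingSketch {

    // Hypothetical stand-in for SensePage; not a Wikipedia Miner class.
    static class Sense {
        final int id;
        final String title;

        Sense(int id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Returns a copy sorted by ascending id; the page-ordered input is left untouched.
    static List<Sense> sortedById(List<Sense> sensesInPageOrder) {
        List<Sense> copy = new ArrayList<>(sensesInPageOrder);
        copy.sort(Comparator.comparingInt((Sense s) -> s.id));
        return copy;
    }

    // Helper that extracts the titles so the two orderings are easy to compare.
    static List<String> titles(List<Sense> senses) {
        List<String> out = new ArrayList<>();
        for (Sense s : senses) {
            out.add(s.title);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up senses, in the order they would appear on the page.
        List<Sense> pageOrder = Arrays.asList(
                new Sense(31, "Mercury (planet)"),
                new Sense(12, "Mercury (element)"),
                new Sense(47, "Mercury (mythology)"));

        System.out.println("page order: " + titles(pageOrder));
        System.out.println("id order:   " + titles(sortedById(pageOrder)));
    }
}
```

Page order is the one to use when prominence matters, since it usually tracks how well known each sense is; id order carries no such signal.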

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Provides a demo of the functionality available to Disambiguations.

Parameters:
args - an array of arguments for connecting to a Wikipedia database: server and database names at a minimum, and optionally a username and password.
Throws:
java.lang.Exception - if there is a problem with the Wikipedia database.
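
One way the argument list described above might be interpreted is sketched below. This is not the code of the demo itself: DemoArgsSketch, ConnectionArgs and parse are hypothetical names, and the positional order (server, database, username, password), the use of null for omitted values, and the rejection of more than four arguments are all assumptions.

```java
class DemoArgsSketch {

    // Hypothetical holder for the connection details; not a Wikipedia Miner class.
    static class ConnectionArgs {
        final String server;
        final String database;
        final String user;      // null when not supplied (assumption)
        final String password;  // null when not supplied (assumption)

        ConnectionArgs(String server, String database, String user, String password) {
            this.server = server;
            this.database = database;
            this.user = user;
            this.password = password;
        }
    }

    // Assumed positional layout: <server> <database> [<username> [<password>]]
    static ConnectionArgs parse(String[] args) {
        if (args.length < 2 || args.length > 4) {
            throw new IllegalArgumentException(
                    "expected: <server> <database> [<username> [<password>]]");
        }
        String user = args.length > 2 ? args[2] : null;
        String password = args.length > 3 ? args[3] : null;
        return new ConnectionArgs(args[0], args[1], user, password);
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        ConnectionArgs parsed = parse(new String[] {"localhost", "enwiki"});
        System.out.println(parsed.server + " / " + parsed.database);
    }
}
```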