Measuring Semantic Relatedness

Sandipan Dey re-works a university assignment on semantic relatedness in Python:

Let’s define the semantic relatedness of two WordNet nouns x and y as follows:

  • A = set of synsets in which x appears
  • B = set of synsets in which y appears
  • distance(x, y) = length of shortest ancestral path of subsets A and B
  • sca(x, y) = a shortest common ancestor of subsets A and B

This is the notion of distance that we need to use to implement the distance() and sca() methods in the WordNet data type.

It looks like a helpful assignment for understanding natural language processing a little better.

Related Posts

Housing Prices In Ames, Iowa: A Kaggle Competition

Kathryn Bryant and M. Aaron Owen share their Kaggle experiences.  First, Kathryn, et al: The lifecycle of our project was a typical one. We started with data cleaning and basic exploratory data analysis, then proceeded to feature engineering, individual model training, and ensembling/stacking. Of course, the process in practice was not quite so linear and […]

Read More

Picking A Python IDE

Kevin Jacobs reviews a few Python IDEs from the perspective of a data scientist: Ladies and gentlemens, this is one of the most perfect IDEs for editing your Python code! At least in my opinion. Jupyter notebook is a web based code editor and can quickly generate visualizations. You can mix up code and text […]

Read More

Categories

October 2017
MTWTFSS
« Sep Nov »
 1
2345678
9101112131415
16171819202122
23242526272829
3031