germanetpy.path_based_relatedness_measures

Classes

PathBasedRelatedness(germanet, category[, ...])

These measures use the GermaNet graph to compute the shortest paths between two concepts.

SemRelMeasure(*args, **kwargs)

This Enum represents the semantic relatedness measures.

class germanetpy.path_based_relatedness_measures.PathBasedRelatedness(germanet, category, max_len: int = None, max_depth: int = None, synset_pair=None)[source]

Bases: object

These measures use the GermaNet graph to compute the shortest paths between two concepts. Both concepts must belong to the same word category. The path lengths are normalized in different ways, depending on the measure, and are computed taking only the hypernymy / hyponymy relations into account.

simple_path(synset1, synset2, normalize: bool = False, normalized_max: float = 1.0) float[source]

This measure computes the path length between the two synsets and normalizes it by the longest possible shortest path between any two nodes of the corresponding word category.

Parameters:
  • synset1 (Synset) – The source synset

  • synset2 (Synset) – The target synset the source synset is compared to

  • normalize – The relatedness value can be normalized to a number between the possible minimum of that measure and a given upper bound.

  • normalized_max – The upper bound of the range the measure is normalized to.

Returns:

The normalized path length between two synsets
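The idea behind this measure can be illustrated with a minimal, self-contained sketch that does not use germanetpy at all. The toy graph, the helper names and the formula `(max_len - path_length) / max_len` are assumptions made for illustration; the library's exact computation may differ.

```python
from collections import deque

# Hypothetical toy hypernymy graph (child -> parents); not real GermaNet data.
HYPERNYMS = {
    "Tier": ["Lebewesen"],
    "Pflanze": ["Lebewesen"],
    "Hund": ["Tier"],
    "Katze": ["Tier"],
    "Baum": ["Pflanze"],
}


def build_undirected(hypernyms):
    """Make every hypernymy edge traversable in both directions."""
    graph = {}
    for child, parents in hypernyms.items():
        for parent in parents:
            graph.setdefault(child, set()).add(parent)
            graph.setdefault(parent, set()).add(child)
    return graph


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the number of edges on the shortest path."""
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    raise ValueError("no path between the two nodes")


def simple_path_sketch(graph, source, target, max_len):
    """Assumed formula: (max_len - path_length) / max_len."""
    return (max_len - shortest_path_length(graph, source, target)) / max_len


GRAPH = build_undirected(HYPERNYMS)
MAX_LEN = 4  # longest shortest path in this toy graph (Hund <-> Baum)

print(simple_path_sketch(GRAPH, "Hund", "Katze", MAX_LEN))
```

Under this assumed formula, identical nodes score highest and the two nodes that are farthest apart in the graph score lowest.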

init_min_max_normalization_values(synset_pair)[source]

This method computes the minimum values (the two synsets are equal) and the maximum values (the two synsets are maximally apart in the graph) used for normalization.

Parameters:

  • synset_pair ((Synset, Synset)) – The tuple of synsets that have the maximum distance in the graph

Returns:

A dictionary {SemRelMeasure: (int, int)} containing the (minimum value, maximum value) for each semantic relatedness measure.

wu_and_palmer(synset1, synset2, normalize: bool = False, normalized_max: float = 1.0) float[source]

This method computes the semantic relatedness by taking the path length into account and normalizing it by the depth of the least common subsumer (LCS). If there are several possible LCSs, the one with the largest depth is used.

Parameters:
  • synset1 (Synset) – The source synset

  • synset2 (Synset) – The target synset the source synset is compared to

  • normalize – The relatedness value can be normalized to a number between the possible minimum of that measure and a given upper bound.

  • normalized_max – The upper bound of the range the measure is normalized to.

Returns:

The Wu and Palmer relatedness measure
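As a rough illustration of how the depth of the LCS enters the score, the sketch below applies the textbook Wu and Palmer formula, `2 * depth(LCS) / (depth(s1) + depth(s2))`, to a toy hierarchy. The graph, the helper names and the convention that the root has depth 1 are assumptions; germanetpy may count depth or path length differently.

```python
# Hypothetical toy hypernymy graph (child -> parents); not real GermaNet data.
HYPERNYMS = {
    "Lebewesen": [],
    "Tier": ["Lebewesen"],
    "Haustier": ["Tier"],
    "Hund": ["Haustier"],
    "Katze": ["Haustier"],
    "Wolf": ["Tier"],
}


def depth(node):
    """Node count on the shortest path to the root (assumed: root has depth 1)."""
    parents = HYPERNYMS[node]
    if not parents:
        return 1
    return 1 + min(depth(parent) for parent in parents)


def ancestors(node):
    """The node itself plus all of its transitive hypernyms."""
    found = {node}
    for parent in HYPERNYMS[node]:
        found |= ancestors(parent)
    return found


def deepest_common_subsumer(a, b):
    """Among all shared ancestors, pick the one with the largest depth."""
    return max(ancestors(a) & ancestors(b), key=depth)


def wu_palmer_sketch(a, b):
    """Textbook formula: 2 * depth(LCS) / (depth(a) + depth(b))."""
    lcs = deepest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wu_palmer_sketch("Hund", "Katze"))
```

Choosing the deepest shared ancestor mirrors the rule stated above for the case of several candidate LCSs.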

leacock_chodorow(synset1, synset2, normalize: bool = False, normalized_max: float = 1.0) float[source]

This method implements the Leacock and Chodorow relatedness measure. Both the path distance and the depth are measured as node counts.

Parameters:
  • synset1 (Synset) – The source synset

  • synset2 (Synset) – The target synset the source synset is compared to

  • normalize – The relatedness value can be normalized to a number between the possible minimum of that measure and a given upper bound.

  • normalized_max – The upper bound of the range the measure is normalized to.

Returns:

The Leacock and Chodorow relatedness measure
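The textbook form of this measure is `-log(path_length / (2 * max_depth))`, with the path length counted in nodes. The sketch below implements only that formula; the function name and its arguments are hypothetical, and how germanetpy derives the path length and maximum depth is not shown here.

```python
import math


def leacock_chodorow_sketch(path_nodes: int, max_depth: int) -> float:
    """Textbook Leacock and Chodorow formula using node counts.

    path_nodes: number of nodes on the shortest path (1 for identical synsets).
    max_depth:  maximum depth of the hierarchy, also counted in nodes.
    """
    if path_nodes < 1 or max_depth < 1:
        raise ValueError("path_nodes and max_depth must be positive")
    return -math.log(path_nodes / (2 * max_depth))


# Shorter paths give larger relatedness values.
print(leacock_chodorow_sketch(3, 20) > leacock_chodorow_sketch(5, 20))
```

Counting nodes rather than edges keeps the argument of the logarithm positive even when both synsets are identical.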

normalize(raw_value: float, normalized_max: float, semrel_measure: SemRelMeasure) float[source]

Normalizes a raw value of semantic relatedness to a value between a lower bound and the given upper bound.

Parameters:
  • raw_value – The raw value

  • normalized_max – The upper bound

  • semrel_measure – The semantic relatedness measure that the raw value corresponds to.

Returns:

The normalized semantic relatedness value
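One plausible reading of this normalization is min-max scaling against the per-measure (minimum, maximum) pair stored in the normalization dictionary. The sketch below assumes exactly that, with a lower bound of 0; the `Measure` enum, its members and the bounds are hypothetical stand-ins, not the actual `SemRelMeasure` values.

```python
from enum import Enum


class Measure(Enum):
    """Hypothetical stand-in for SemRelMeasure."""
    SIMPLE_PATH = "simple_path"
    LEACOCK_CHODOROW = "leacock_chodorow"


# Hypothetical (minimum, maximum) raw values per measure.
NORMALIZATION = {
    Measure.SIMPLE_PATH: (0.0, 1.0),
    Measure.LEACOCK_CHODOROW: (0.0, 4.0),
}


def normalize_sketch(raw_value, normalized_max, measure):
    """Assumed min-max scaling of raw_value into [0, normalized_max]."""
    lower, upper = NORMALIZATION[measure]
    return (raw_value - lower) / (upper - lower) * normalized_max


print(normalize_sketch(2.0, 10.0, Measure.LEACOCK_CHODOROW))
```

Scaling every measure to the same range makes values from different measures easier to compare side by side.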

property germanet
property max_len
property max_depth
property category
property normalization_dic