germanetpy package¶

Submodules¶

germanetpy.compoundInfo module¶

class germanetpy.compoundInfo.CompoundCategory(*args: Any, **kwargs: Any)[source]¶

Bases: Enum

This Enum represents the syntactic word category that a modifier of a compound can belong to.

Adjektiv = 'Adjektiv'¶

Nomen = 'Nomen'¶

Verb = 'Verb'¶

Adverb = 'Adverb'¶

Präposition = 'Präposition'¶

Partikel = 'Partikel'¶

Pronomen = 'Pronomen'¶

class germanetpy.compoundInfo.CompoundProperty(*args: Any, **kwargs: Any)[source]¶

Bases: Enum

This Enum represents the properties a compound constituent can have.

Abkürzung = 'Abkürzung'¶

Affixoid = 'Affixoid'¶

Fremdwort = 'Fremdwort'¶

Konfix = 'Konfix'¶

Wortgruppe = 'Wortgruppe'¶

Eigenname = 'Eigenname'¶

opaquesMorphem = 'opaquesMorphem'¶

virtuelleBildung = 'virtuelleBildung'¶

gebundenesMorphem = 'gebundenesMorphem'¶

freiesMorphem = 'freiesMorphem'¶

nominalisiertesVerb = 'nominalisiertesVerb'¶

class germanetpy.compoundInfo.CompoundInfo(modifier1, head, modifier2=None, modifier1property=None, modifier1category=None, mod1LexUnitId1=None, mod1LexUnitId2=None, mod1LexUnitId3=None, modifier2property=None, modifier2category=None, mod2LexUnitId1=None, mod2LexUnitId2=None, mod2LexUnitId3=None, headproperty=None, headLexUnitId=None)[source]¶

Bases: object

PROPERTY = 'property'¶

CATEGORY = 'category'¶

XML_LEX_UNIT_ID = 'lexUnitId'¶

XML_LEX_UNIT_ID2 = 'lexUnitId2'¶

XML_LEX_UNIT_ID3 = 'lexUnitId3'¶

property modifier1¶

property modifier1_property¶

property modifier1_category¶

property mod1_LexUnitId1¶

property mod1_LexUnitId2¶

property mod1_LexUnitId3¶

property modifier2¶

property modifier2_property¶

property modifier2_category¶

property mod2_LexUnitId1¶

property mod2_LexUnitId2¶

property mod2_LexUnitId3¶

property head¶

property head_property¶

property head_LexUnitId¶

germanetpy.filterconfig module¶

class germanetpy.filterconfig.Filterconfig(search_string: str, ignore_case: bool = False, regex: bool = False, levenshtein_distance: int = 0)[source]¶

Bases: object

This class is a configuration object that helps to filter GermaNet's lexical units and synsets and to extract those with certain properties of interest.

filter_lexunits(germanet) → set[source]¶

Applies the filter to the GermaNet data

Parameters:: germanet (Germanet) – the GermaNet object, loaded from the data
Returns:: a set of lexical units that are left after retrieval is filtered with the given constraints

filter_synsets(germanet) → set[source]¶

Applies the filter to the GermaNet data

Parameters:: germanet (Germanet) – the GermaNet object, loaded from the data
Returns:: a set of synsets that are left after retrieval is filtered with the given constraints
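The constructor options above (search_string, ignore_case, regex, levenshtein_distance) can be illustrated with a minimal, self-contained sketch. It does not use germanetpy; the helper names levenshtein and matches are hypothetical, and how the library actually combines the options (for example whether a regular expression must match the whole form) is an assumption.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def matches(form, search_string, ignore_case=False, regex=False,
            levenshtein_distance=0):
    """Hypothetical predicate: does an orthographic form pass the filter?"""
    if regex:
        # Assumption: the pattern has to match the complete form.
        flags = re.IGNORECASE if ignore_case else 0
        return re.fullmatch(search_string, form, flags) is not None
    if ignore_case:
        form, search_string = form.lower(), search_string.lower()
    # A distance of 0 means that only exact matches pass.
    return levenshtein(form, search_string) <= levenshtein_distance


# Toy word list standing in for the orthographic forms of lexical units.
words = ["Haus", "Maus", "Hauses", "Baum"]
close_to_haus = {w for w in words if matches(w, "Haus", levenshtein_distance=1)}
```

Here close_to_haus keeps the forms within one edit of "Haus"; raising levenshtein_distance widens the result set.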

property search_string¶

property ignore_case¶

property regex¶

property levenshtein_distance¶

property word_classes¶

property word_categories¶

property orth_variants¶

germanetpy.frames module¶

class germanetpy.frames.Frames(frames2lexunits: dict)[source]¶

Bases: object

EXPLETIVE = 'NE'¶

SUBJECT = 'NN'¶

ACCOBJ = 'AN'¶

DATOBJ = 'DN'¶

GENOBJ = 'GN'¶

PREPOBJ = 'PP'¶

LOC = 'BL'¶

DIR = 'BD'¶

TEMP = 'BT'¶

MAN = 'BM'¶

INST = 'BS'¶

CAUSE = 'BC'¶

ROLE = 'BR'¶

COM = 'BO'¶

reflexives = ['DR', 'AR']¶

extract_expletives() → set[source]¶

This method extracts all verbs that can take expletives as an argument. Example: “[Es] regnet.”

Returns:: A set of lexical units that stores all verbs as Lexunits that have the specified frame.

extract_accusative_complement() → set[source]¶

This method returns all verbs that can take an accusative complement. Example: “Sie sieht [ihn]”

Returns:: A set of lexical units that stores all verbs as Lexunits that have the specified frame.

extract_dative_complement() → set[source]¶

This method returns all verbs that can take a dative complement. Example: “Sie schenkt [ihm] einen Hund.”

Returns:: A set of lexical units that stores all verbs as Lexunits that have the specified frame.

extract_gentive_complement() → set[source]¶

This method returns all verbs that can take a genitive complement. Example: “Ihre Eltern berauben sie [ihrer Freiheit].”

Returns:: A set of lexical units that stores all verbs as Lexunits that have the specified frame.

extract_prepositional_complement() → set[source]¶

This method returns all verbs that can take a prepositional complement. Example: “Die Kugel klackte [an die Fensterscheibe].”

Returns:: A set of lexical units that stores all verbs as Lexunits that have the specified frame.

extract_reflexives() → set[source]¶

This method returns all verbs that can take a reflexive complement. Example: “Sie wird [sich] rächen.”

Returns:: A set of lexical units that stores all verbs as Lexunits that have the specified frame.

extract_adverbials() → set[source]¶

This method returns all verbs that can take an adverbial complement. Example: “Sie wohnt [in einem Haus].”

Returns:: A set of lexical units that stores all verbs as Lexunits that have the specified frame.

extract_transitives() → set[source]¶

This method returns all transitive verbs. A transitive verb is any verb that can have objects.

Returns:: A set of lexical units that stores all transitive verbs as Lexunits.

extract_intransitives() → set[source]¶

This method returns all intransitive verbs. An intransitive verb is any verb that does not have objects.

Returns:: A set of lexical units that stores all intransitive verbs as Lexunits.

extract_specific_complements(complement: str) → set[source]¶

This method returns all verbs that can take a given complement. This is specified in the frames of a verb.

Parameters:: complement – a syntactic complement (e.g. NN for subject); the complements are defined as class variables of this class
Returns:: A set of lexical units that stores all verbs as Lexunits that can take the specified complement.
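As a rough illustration of complement lookup, the following sketch filters a toy frame table by complement code. It assumes that a frame string is a period-separated sequence of codes, as in the 'NN.AN' example used elsewhere in this reference; real frame strings may be more complex, and the helper verbs_with_complement and the toy table are hypothetical.

```python
# Complement codes copied from the class variables listed above.
SUBJECT, ACCOBJ, DATOBJ, GENOBJ = "NN", "AN", "DN", "GN"


def verbs_with_complement(frames2verbs, complement):
    """Collect the verbs of every frame that contains the given complement code."""
    result = set()
    for frame, verbs in frames2verbs.items():
        # Assumption: codes inside a frame are separated by periods.
        if complement in frame.split("."):
            result |= set(verbs)
    return result


# Toy mapping from frame strings to verbs (plain strings instead of Lexunits).
toy_frames2verbs = {
    "NN": {"schlafen"},
    "NN.AN": {"sehen"},
    "NN.DN.AN": {"schenken"},
}
accusative_verbs = verbs_with_complement(toy_frames2verbs, ACCOBJ)
```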

property frames2verbs¶

germanetpy.germanet module¶

class germanetpy.germanet.Germanet(datadir: str, add_ilirecords: bool = True, add_wiktionary: bool = True)[source]¶

Bases: object

get_synsets_by_orthform(form: str, ignorecase: bool = False) → list[source]¶

This method returns a list of synsets that match the given input search string

Parameters:

form – a word that can be looked up in the GermaNet
ignorecase – whether the case of the word should be ignored (default = False)

Returns:

a list of synsets

get_synsets_by_wordcategory(category) → list[source]¶

Returns a list of synsets that belong to the specified word category

Parameters:: category (WordCategory) – The word category of interest
Returns:: A list of Synsets that belong to the specified word category

get_synsets_by_wordclass(wordclass) → list[source]¶

Returns a list of synsets that belong to the specified word class

Parameters:: wordclass (WordClass) – The word class of interest
Returns:: A list of Synsets that belong to the specified word class

get_synset_by_id(id: str)[source]¶

Returns a Synset by a specified identifier (if that exists, otherwise raises an Error)

Return type:: Synset
Parameters:: id – a Synset identifier
Returns:: The matching Synset object

get_lexunit_by_id(id: str)[source]¶

Returns a lexical unit by a specified identifier (if that exists, otherwise raises an Error)

Return type:: Lexunit
Parameters:: id – a Lexunit identifier
Returns:: The matching Lexunit object

get_lexunits_by_orthform(form: str, ignorecase: bool = False) → list[source]¶

This method returns a list of lexical units that match the given input search string

Parameters:

form – a word that can be looked up in the GermaNet
ignorecase – whether the case of the word should be ignored (default = False)

Returns:

a list of lexical units that match the given input query
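A hedged usage sketch for the lookup methods: because the real Germanet class needs a GermaNet data directory, a stub object stands in for a loaded instance here. Only get_lexunits_by_orthform and the properties orthform, synset and id are taken from this reference; StubGermanet, synset_ids_for and the identifiers "s1" and "s2" are hypothetical.

```python
from types import SimpleNamespace


class StubGermanet:
    """Hypothetical stand-in that mimics a single Germanet lookup method."""

    def __init__(self, lexunits):
        self._lexunits = lexunits

    def get_lexunits_by_orthform(self, form, ignorecase=False):
        if ignorecase:
            return [lu for lu in self._lexunits
                    if lu.orthform.lower() == form.lower()]
        return [lu for lu in self._lexunits if lu.orthform == form]


def synset_ids_for(germanet, form, ignorecase=False):
    """Return the sorted ids of all synsets that contain a matching lexical unit."""
    lexunits = germanet.get_lexunits_by_orthform(form, ignorecase=ignorecase)
    return sorted({lexunit.synset.id for lexunit in lexunits})


stub = StubGermanet([
    SimpleNamespace(orthform="Bank", synset=SimpleNamespace(id="s1")),
    SimpleNamespace(orthform="Bank", synset=SimpleNamespace(id="s2")),
    SimpleNamespace(orthform="bank", synset=SimpleNamespace(id="s2")),
])
```

With a real installation, the same helper would presumably be called with an object created as Germanet(datadir) instead of the stub.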

get_lexunits_by_wordclass(wordclass) → list[source]¶

Returns a list of lexical units that belong to the specified word class

Parameters:: wordclass (WordClass) – The word class of interest
Returns:: A list of lexical units that belong to the specified word class

get_lexunits_by_wordcategory(category) → list[source]¶

Returns a list of lexical units that belong to the specified word category

Parameters:: category (WordCategory) – The word category of interest
Returns:: A list of lexical units that belong to the specified word category

get_synsets_by_frame(frame: str) → list[source]¶

Returns a list of Synsets that match a specified frame

Parameters:: frame – a frame that describes the argument structure of a verb (e.g. ‘NN.AN’ specifies that a verb can take a subject and accusative object as arguments.)
Returns:: a list of Synsets that match the given frame. If the frame is not valid, an AssertionError is raised.

property lexunits¶

property synsets¶

property orthform2lexid¶

property mainOrtform2lexid¶

property lowercasedform2lexid¶

property wordcat2lexid¶

property wordclass2lexid¶

property compounds¶

property frames2lexunits¶

property wiktionary_entries¶

property ili_records¶

property frames¶

property root¶

property datadir¶

property add_ilirecords¶

property add_wiktionary¶

germanetpy.icbased_similarity module¶

class germanetpy.icbased_similarity.ICBasedSimilarity(germanet, wordcategory, path: str, separator: str = '\t')[source]¶

Bases: object

The IC-based measures are computed based on relative frequencies of words in a large corpus. Synset frequencies are computed by adding up the frequencies of all words that belong to a Synset. These measures cannot be computed between synsets with different word categories.

create_simple_freq_dic(word_category, path: str, separator: str)[source]¶

Reads in the frequency list files and stores the frequency information for each Synset in a dictionary. The keys are the Synset IDs. This method also adds all available Synset frequencies for the given category.

Parameters:

word_category (WordCategory) – The word category
path – The path to a frequency list containing words and their frequencies in a corpus
separator – The char that separates a word and its frequency in the given frequency list

init_min_max_normalization_values(synset_pair) → dict[source]¶

This method computes the minimum values (two Synsets are equal) and the maximum values (two Synsets are maximally apart in the graph) for normalization

Parameters:: synset_pair (tuple(Synset, Synset)) – The Tuple of synsets that have the maximum distance in the graph
Returns:: a dictionary containing the (minimum value, maximum value) for each semantic similarity measure.

init_ic_map()[source]¶

Computes the information content for each synset in GermaNet (of a given word category).

Return type:: dict, Synset
Returns:: A dictionary with a Synset and the corresponding IC, a Synset with the highest IC

get_information_content(synset) → float[source]¶

The information content graduates semantic concepts from general to specific. The more specific a concept, the smaller its probability and thus the higher its informativeness. The information content of a semantic concept is estimated by the relative frequency of the concept in a large corpus (cumulated synset frequency).

Parameters:: synset (Synset) – the synset the information content should be computed for
Returns:: the information content for the given synset
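A minimal sketch of this idea on a toy hierarchy, assuming the common definition IC(s) = -log(cumulated frequency of s / cumulated frequency of the root). The exact formula, logarithm base and smoothing used by the library are not stated here, so these are assumptions; all names and frequencies below are made up.

```python
import math


def cumulative_frequencies(hyponyms, simple_freq, root):
    """Sum each node's own frequency and that of all its hyponyms (tree-shaped toy data)."""
    cumfreq = {}

    def visit(node):
        total = simple_freq.get(node, 0)
        total += sum(visit(child) for child in hyponyms.get(node, []))
        cumfreq[node] = total
        return total

    visit(root)
    return cumfreq


def information_content(cumfreq, root, synset):
    """Negative log of the relative cumulated frequency: rarer means more informative."""
    return -math.log(cumfreq[synset] / cumfreq[root])


# Toy hierarchy (parent -> hyponyms) with invented corpus frequencies.
toy_hyponyms = {"root": ["animal", "plant"], "animal": ["dog", "cat"]}
toy_freq = {"animal": 10, "plant": 50, "dog": 10, "cat": 30}
toy_cumfreq = cumulative_frequencies(toy_hyponyms, toy_freq, "root")
```

In this toy data the root has an information content of 0, and "dog" is more informative than its hypernym "animal".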

resnik(synset1, synset2, normalize: bool = False, normalized_max: float = 1.0) → float[source]¶

Two concepts are more related the more information they share. The shared information of two concepts can be quantified by the information content of two concepts’ lowest common subsumer. When several LCS are available the highest IC is returned.

Parameters:

synset1 (Synset) – The source synset
synset2 (Synset) – The target synset
normalize – The relatedness value can be normalized to a number between the possible minimum of that measure and a given upper bound.
normalized_max – The upper bound of the range the measure is normalized to.

Returns:

The information content of the LCS of the two given synsets.
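The Resnik measure can be sketched without germanetpy as the highest information content among all subsumers shared by two synsets. The toy hierarchy, the IC values and the helper names are hypothetical, and counting a synset as its own subsumer is an assumption of this sketch.

```python
def ancestors(hypernyms, synset):
    """The synset itself plus all of its direct and indirect hypernyms."""
    seen = {synset}
    stack = [synset]
    while stack:
        for parent in hypernyms.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def resnik(hypernyms, ic, synset1, synset2):
    """Information content of the most informative common subsumer."""
    common = ancestors(hypernyms, synset1) & ancestors(hypernyms, synset2)
    return max(ic[synset] for synset in common)


# Toy data: child -> direct hypernyms, plus invented IC values.
toy_hypernyms = {"dog": ["animal"], "cat": ["animal"],
                 "animal": ["root"], "plant": ["root"]}
toy_ic = {"root": 0.0, "animal": 0.7, "plant": 0.7, "dog": 2.3, "cat": 1.2}
```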

jiang_and_conrath(synset1, synset2, normalize: float = False, normalized_max: float = 1.0) → float[source]¶

The Jiang and Conrath measure includes knowledge about the individual information contents of each synset. The smaller the difference of the information content of the two synsets, the more related they are.

Parameters:

synset1 (Synset) – The source synset
synset2 (Synset) – The target synset
normalize – The relatedness value can be normalized to a number between the possible minimum of that measure and a given upper bound.
normalized_max – The upper bound of the range the measure is normalized to.

Returns:

The Jiang and Conrath relatedness measure

lin(synset1, synset2, normalize: bool = False, normalized_max: float = 1.0) → float[source]¶

The Lin measure takes the individual information contents of each synset and the information content of the LCS into account. The LCS with the highest information content is used for the computation.

Parameters:

synset1 (Synset) – The source synset
synset2 (Synset) – The target synset
normalize – The relatedness value can be normalized to a number between the possible minimum of that measure and a given upper bound.
normalized_max – The upper bound of the range the measure is normalized to.

Returns:

The Lin relatedness measure
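For orientation, the textbook forms of the Jiang and Conrath distance and the Lin similarity are sketched below in terms of three information content values. Whether the library returns exactly these quantities is an assumption; in particular, it may convert the Jiang and Conrath distance into a relatedness score (the jcnmaxdist property hints at this). The function names and the IC values are hypothetical.

```python
def jiang_conrath_distance(ic1, ic2, ic_lcs):
    """Textbook distance: small when both synsets are close to their LCS."""
    return ic1 + ic2 - 2.0 * ic_lcs


def lin_similarity(ic1, ic2, ic_lcs):
    """Textbook Lin similarity: shared information relative to total information."""
    denominator = ic1 + ic2
    if denominator == 0:
        # Choice of this sketch: two synsets without information content score 0.
        return 0.0
    return 2.0 * ic_lcs / denominator


# Invented IC values for two synsets and their lowest common subsumer.
ic_dog, ic_cat, ic_animal = 2.3, 1.2, 0.7
jcn = jiang_conrath_distance(ic_dog, ic_cat, ic_animal)
lin = lin_similarity(ic_dog, ic_cat, ic_animal)
```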

normalize(raw_value: float, normalized_max: float, semrel_measure: SemRelMeasure) → float[source]¶

Normalizes a raw value of semantic relatedness to a value between a lower bound and the given upper bound.

Parameters:

raw_value – The raw value
normalized_max – The upper bound
semrel_measure – The semantic relatedness measure the value corresponds to.

Returns:

The normalized semantic relatedness value
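One plausible reading of this normalization is a linear min-max rescaling, sketched below. The signature differs from the method above: instead of a SemRelMeasure, the sketch takes the minimum and maximum raw values directly (the library presumably looks them up in normalization_dic). Mapping the minimum to 0 is an assumption.

```python
def min_max_normalize(raw_value, normalized_max, min_value, max_value):
    """Linearly map raw_value from [min_value, max_value] to [0, normalized_max]."""
    if max_value == min_value:
        raise ValueError("min_value and max_value must differ")
    return (raw_value - min_value) / (max_value - min_value) * normalized_max


half = min_max_normalize(5.0, 1.0, 0.0, 10.0)
```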

property germanet¶

property root_freq¶

property synset2cumfreq¶

property jcnmaxdist¶

property normalization_dic¶

property synset2ic¶

property most_informative_synset¶

property synset2simple_freq¶

germanetpy.iliLoader module¶

germanetpy.iliLoader.create_ili_record(attributes, synonyms) → IliRecord[source]¶

Creates the ili record given the XML attributes.

Parameters:

attributes (xml attributes) – The XML attributes that contain the required information about the ili record.
synonyms (list(String)) – A list of Strings, containing the synonyms of the ili record.

Returns:

The ili record object

germanetpy.iliLoader.load_ili(germanet, tree)[source]¶

This method creates the ili record objects given a datafile and adds them to the GermaNet object and the corresponding lexical unit.

Parameters:

germanet (Germanet) – The GermaNet object
tree (Element Tree) – The XML tree containing the data about the ili records

germanetpy.iliRecord module¶

class germanetpy.iliRecord.IliRecord(lexunit_id: str, ewnRelation: str, pwnWord: str, pwn20Id: str, pwn30Id: str, source: str, pwn20synonyms: list, pwn20paraphrase: str = None)[source]¶

Bases: object

property lexunit_id¶

property relation¶

property english_equivalent¶

property pwn20id¶

property pwn30id¶

property pwn20synonyms¶

property pwn20paraphrase¶

property source¶

germanetpy.lexunit module¶

class germanetpy.lexunit.LexRel(*args: Any, **kwargs: Any)[source]¶

Bases: Enum

This enum represents the lexical relation (short: LexRel) that a Lexunit can have in GermaNet. You can find a description of each relation at: https://uni-tuebingen.de/en/142846

has_synonym = 'has_synonym'¶

has_antonym = 'has_antonym'¶

has_pertainym = 'has_pertainym'¶

has_participle = 'has_participle'¶

has_active_usage = 'has_active_usage'¶

has_occasion = 'has_occasion'¶

has_attribute = 'has_attribute'¶

has_appearance = 'has_appearance'¶

has_construction_method = 'has_construction_method'¶

has_container = 'has_container'¶

is_container_for = 'is_container_for'¶

has_consistency_of = 'has_consistency_of'¶

has_component = 'has_component'¶

has_owner = 'has_owner'¶

is_owner_of = 'is_owner_of'¶

has_function = 'has_function'¶

has_manner_of_functioning = 'has_manner_of_functioning'¶

has_origin = 'has_origin'¶

has_production_method = 'has_production_method'¶

has_content = 'has_content'¶

has_no_property = 'has_no_property'¶

has_habitat = 'has_habitat'¶

has_location = 'has_location'¶

is_location_of = 'is_location_of'¶

has_measure = 'has_measure'¶

is_measure_of = 'is_measure_of'¶

has_material = 'has_material'¶

has_member = 'has_member'¶

is_member_of = 'is_member_of'¶

has_diet = 'has_diet'¶

is_diet_of = 'is_diet_of'¶

has_eponym = 'has_eponym'¶

has_user = 'has_user'¶

has_product = 'has_product'¶

is_product_of = 'is_product_of'¶

has_prototypical_holder = 'has_prototypical_holder'¶

is_prototypical_holder_for = 'is_prototypical_holder_for'¶

has_prototypical_place_of_usage = 'has_prototypical_place_of_usage'¶

has_relation = 'has_relation'¶

has_raw_product = 'has_raw_product'¶

has_other_property = 'has_other_property'¶

is_storage_for = 'is_storage_for'¶

has_specialization = 'has_specialization'¶

has_part = 'has_part'¶

is_part_of = 'is_part_of'¶

has_topic = 'has_topic'¶

is_caused_by = 'is_caused_by'¶

is_cause_for = 'is_cause_for'¶

is_comparable_to = 'is_comparable_to'¶

has_usage = 'has_usage'¶

has_result_of_usage = 'has_result_of_usage'¶

has_purpose_of_usage = 'has_purpose_of_usage'¶

has_goods = 'has_goods'¶

has_time = 'has_time'¶

is_access_to = 'is_access_to'¶

has_ingredient = 'has_ingredient'¶

is_ingredient_of = 'is_ingredient_of'¶

class germanetpy.lexunit.OrthFormVariant(*args: Any, **kwargs: Any)[source]¶

Bases: Enum

This enum represents the four possible orthographical variations

orthForm = 'orthForm'¶

orthVar = 'orthVar'¶

oldOrthForm = 'oldOrthForm'¶

oldOrthVar = 'oldOrthVar'¶

class germanetpy.lexunit.Lexunit(id: str, synset, sense: int, source: str, named_entity: bool, style_marking: bool, artificial: bool, compound_info=None, orthform: str = None, old_orthform: str = None, orthvar: str = None, old_orthvar: str = None, particle: str = None, base_verb: str = None, comment: str = None)[source]¶

Bases: object

This class holds the lexical unit object of GermaNet. A lexical unit is a concrete word that is part of a synset.

get_orthform_variant(orthform_variant) → str[source]¶

Parameters:: orthform_variant (OrthFormVariant) – one of the four orthform_variants
Returns:: the string of the requested orthform variant or the main orthform, if the requested orthform doesn’t exist.

get_synonyms()[source]¶

get_related_lexunits(lexrel_type, direction='outgoing')[source]¶

get_all_orthforms() → set[source]¶

Returns:: A set of all existing orthform variants of the current lexunit.

property id¶

property synset¶

property sense¶

property orthform¶

property orthvar¶

property old_orthform¶

property old_orthvar¶

property particle¶

property base_verb¶

property comment¶

property frames¶

property examples¶

Returns:: The Example objects that belong to this lexical unit.

property ili_records¶

property frames2examples¶

Returns:: A mapping from example-frame strings to the Example objects that use them.

property wiktionary_paraphrases¶

property compound_info¶

property relations¶

property incoming_relations¶

property artificial¶

germanetpy.longest_shortest_path module¶

germanetpy.longest_shortest_path.get_overall_longest_shortest_distance(germanet, category) -> (<class 'dict'>, <class 'int'>)[source]¶

Iterates through the synsets of a given word category. For each synset, extracts all possible hypernyms and computes the shortest possible distance to each hypernym. From these distances, also stores the longest possible shortest distance.

Parameters:

germanet (Germanet) – the germanet graph
category (WordCategory) – the wordcategory

Returns:

a dictionary with each synset and its longest shortest distance, the overall longest shortest distance

germanetpy.longest_shortest_path.get_greatest_depth(germanet, category) → int[source]¶

Iterates through the synsets of a given word category. For each synset, checks the depth and returns the greatest depth that has been seen.

Parameters:

germanet (Germanet) – the germanet graph
category (WordCategory) – the wordcategory

Returns:

the greatest depth for a given word category. The depth of a synset is defined by the shortest path length between the synset and the root node

germanetpy.longest_shortest_path.get_longest_possible_shortest_distance(germanet, wordcategory)[source]¶

Computes the longest possible shortest distance with the following pruned search:

Set maxdistance = 0. For each synset, get the corresponding longest shortest distance. If this plus the overall longest shortest distance is smaller than maxdistance, continue with the next synset. If it is larger, go through each synset and get the corresponding longest shortest distance. If this plus the longest shortest distance of the synset of interest is smaller than maxdistance, continue; otherwise compute the actual path distance and update maxdistance if it is larger.

Return type:

(int, int, tuple(Synset, Synset))

Parameters:

wordcategory (WordCategory) – the wordcategory for which this maxlen should be computed
germanet (Germanet) – the germanet graph

Returns:

the longest possible shortest distance between two synsets of a specified wordcategory, the maximum depth of any synset (length to the root) and a Tuple with two synsets that have the longest shortest distance
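The pruning idea described for this function can be sketched on a toy graph. As a simplification, the sketch uses each node's depth (distance to the root) as the upper bound instead of the per-synset longest shortest distance to a hypernym, and it treats hypernymy as undirected edges of a connected graph. All names and the toy data are hypothetical.

```python
from collections import deque


def bfs_distances(neighbours, start):
    """Edge counts of the shortest paths from start to every reachable node."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in neighbours[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def longest_shortest_distance(hypernyms, root):
    """Return (longest shortest distance, maximum depth, pair of nodes)."""
    neighbours = {root: set()}
    for child, parents in hypernyms.items():
        for parent in parents:
            neighbours.setdefault(child, set()).add(parent)
            neighbours.setdefault(parent, set()).add(child)
    depth = bfs_distances(neighbours, root)
    max_depth = max(depth.values())
    best, best_pair = 0, (root, root)
    for source in neighbours:
        # A shortest path from source is at most depth[source] + max_depth long,
        # because both endpoints can always be connected via the root.
        if depth[source] + max_depth <= best:
            continue
        from_source = bfs_distances(neighbours, source)
        for target in neighbours:
            if depth[source] + depth[target] <= best:
                continue  # this pair cannot beat the current maximum
            if from_source[target] > best:
                best, best_pair = from_source[target], (source, target)
    return best, max_depth, best_pair


# Toy hierarchy: child -> direct hypernyms.
toy_hypernyms = {"a": ["root"], "b": ["root"], "a1": ["a"],
                 "a2": ["a1"], "b1": ["b"]}
toy_best, toy_max_depth, toy_pair = longest_shortest_distance(toy_hypernyms, "root")
```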

germanetpy.longest_shortest_path.print_longest_shortest_distances(germanet, word_category)[source]¶

Computes and prints the longest shortest distances for the given word category.

germanetpy.longest_shortest_path.print_maximum_depths(germanet, word_category)[source]¶

Computes and prints the maximum depth for the given word_category.

germanetpy.path_based_relatedness_measures module¶

class germanetpy.path_based_relatedness_measures.PathBasedRelatedness(germanet, category, max_len: int = None, max_depth: int = None, synset_pair=None)[source]¶

Bases: object

These measures use the GermaNet graph to compute the shortest paths between two concepts. These concepts have to have the same word category. The path lengths are normalized in different ways (depending on the measure). The path lengths are computed taking only the hypernymy / hyponymy relations into account.

simple_path(synset1, synset2, normalize: bool = False, normalized_max: float = 1.0) → float[source]¶

This measure computes the path length and normalizes it by the longest possible shortest path between any two nodes of the corresponding word category.

Parameters:

synset1 (Synset) – The source synset
synset2 (Synset) – The target synset the source synset is compared to
normalize – The relatedness value can be normalized to a number between the possible minimum of that measure and a given upper bound.
normalized_max – The upper bound of the range the measure is normalized to.

Returns:

The normalized path length between two synsets
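A hedged sketch of one way to turn a path length into a relatedness score: subtract the distance from the longest possible shortest path and divide by that maximum. Whether the library uses exactly this formula is an assumption; the function below takes plain numbers rather than Synset objects.

```python
def simple_path_relatedness(distance, max_len):
    """1.0 for identical synsets, 0.0 for a pair that is maximally far apart."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return (max_len - distance) / max_len


halfway = simple_path_relatedness(5, 10)
```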

init_min_max_normalization_values(synset_pair)[source]¶

This method computes the minimum values (two synsets are equal) and the maximum values (two synsets are maximally apart in the graph) for normalization

Parameters:: synset_pair – (Synset, Synset) The Tuple of synsets that have the maximum distance in the graph
Returns:: a dictionary [SemRelMeasure : (int, int)] containing the (minimum value, maximum value) for each semantic similarity measure.

wu_and_palmer(synset1, synset2, normalize: bool = False, normalized_max: float = 1.0) → float[source]¶

This method computes the semantic relatedness by taking the path length into account, normalizing by the depth of the LCS. If there are several possible LCS, the one with the largest depth is taken into account.

Parameters:

synset1 (Synset) – The source synset
synset2 (Synset) – The target synset the source synset is compared to
normalize – The relatedness value can be normalized to a number between the possible minimum of that measure and a given upper bound.
normalized_max – The upper bound of the range the measure is normalized to.

Returns:

The Wu and Palmer relatedness measure
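The textbook Wu and Palmer formula is sketched below for reference, expressed in edge counts. Whether the library counts edges or nodes, and how it treats the root, is not stated here, so this is an assumption; the function takes plain numbers rather than Synset objects.

```python
def wu_palmer(depth_lcs, dist1_to_lcs, dist2_to_lcs):
    """2 * depth(LCS) / (dist(s1, LCS) + dist(s2, LCS) + 2 * depth(LCS))."""
    denominator = dist1_to_lcs + dist2_to_lcs + 2 * depth_lcs
    if denominator == 0:
        # Choice of this sketch: the root compared with itself counts as identical.
        return 1.0
    return 2 * depth_lcs / denominator


siblings = wu_palmer(depth_lcs=2, dist1_to_lcs=1, dist2_to_lcs=1)
```

The score grows as the shared subsumer gets deeper and as both synsets move closer to it.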

leacock_chodorow(synset1, synset2, normalize: bool = False, normalized_max: float = 1.0) → float[source]¶

This method implements the Leacock and Chodorow relatedness measure. For the path distance and depth, node count is used.

Parameters:

synset1 (Synset) – The source synset
synset2 (Synset) – The target synset the source synset is compared to
normalize – The relatedness value can be normalized to a number between the possible minimum of that measure and a given upper bound.
normalized_max – The upper bound of the range the measure is normalized to.

Returns:

The Leacock and Chodorow relatedness measure
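The textbook Leacock and Chodorow formula, -log(path length / (2 * maximum depth)), can be sketched with node counts as mentioned above. The logarithm base and the exact way the library counts nodes are assumptions; the function takes plain numbers rather than Synset objects.

```python
import math


def leacock_chodorow(path_nodes, max_depth_nodes):
    """Negative log of the path length scaled by twice the maximum depth (node counts)."""
    if path_nodes < 1 or max_depth_nodes < 1:
        raise ValueError("node counts must be at least 1")
    return -math.log(path_nodes / (2 * max_depth_nodes))


# A synset compared with itself spans a single node.
identical = leacock_chodorow(path_nodes=1, max_depth_nodes=10)
```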

normalize(raw_value: float, normalized_max: float, semrel_measure: SemRelMeasure) → float[source]¶

Normalizes a raw value of semantic relatedness to a value between a lower bound and the given upper bound.

Parameters:

raw_value – The raw value
normalized_max – The upper bound
semrel_measure – The semantic relatedness measure the value corresponds to.

Returns:

The normalized semantic relatedness value

property germanet¶

property max_len¶

property max_depth¶

property category¶

property normalization_dic¶

germanetpy.relationLoader module¶

germanetpy.relationLoader.get_relation_attributes(attributes) -> (<class 'str'>, <class 'str'>, <class 'str'>, <class 'str'>)[source]¶

Parameters:: attributes (XML attribute) – The XML attributes the information can be extracted from
Returns:: The information as Strings or None if the information is not present. The name of the relation, the id of the start node, the id of the end node, the type of direction and if the relation is inverse

germanetpy.relationLoader.load_relations(germanet, tree)[source]¶

Loads the information about the related synsets and lexunits from the data and adds the edges between the objects.

Parameters:

germanet (Germanet) – The Germanet object that is populated with Synsets and Lexunits
tree (Element Tree) – The XML tree of the relation data.

germanetpy.semrel_measures module¶

class germanetpy.semrel_measures.SemRelMeasure(*args: Any, **kwargs: Any)[source]¶

Bases: Enum

This Enum represents the semantic relatedness measures

SimplePath = 'SimplePath'¶

LeacockAndChodorow = 'LeacockAndChodorow'¶

WuAndPalmer = 'WuAndPalmer'¶

Resnik = 'Resnik'¶

Lin = 'Lin'¶

JiangAndConrath = 'JiangAndConrath'¶

germanetpy.synset module¶

class germanetpy.synset.ConRel(*args: Any, **kwargs: Any)[source]¶

Bases: Enum

This Enum class contains the conceptual relations (short: ConRel) that synsets can have to other synsets. For a description of each relation look at https://uni-tuebingen.de/en/142846

has_hypernym = 1¶

has_hyponym = 2¶

has_component_meronym = 3¶

has_component_holonym = 4¶

has_member_meronym = 5¶

has_member_holonym = 6¶

has_substance_meronym = 7¶

has_substance_holonym = 8¶

has_portion_meronym = 9¶

has_portion_holonym = 10¶

entails = 11¶

is_entailed_by = 12¶

is_related_to = 13¶

causes = 14¶

static transitive(conrel) → bool[source]¶

Returns true if the conceptual relation is transitive, false otherwise

Parameters:: conrel (ConRel) – a conceptual relation
Returns:: true if the conceptual relation is transitive, false otherwise

class germanetpy.synset.WordCategory(*args: Any, **kwargs: Any)[source]¶

Bases: Enum

This Enum class contains the three part-of-speech tags (WordCategory) a Synset can have in GermaNet. adj = adjective, nomen = noun, verben = verb

adj = 1¶

nomen = 2¶

verben = 3¶

static get_possible_word_classes(word_category) → set[source]¶

Each word category can only occur with a specific set of word classes.

Parameters:: word_category (WordCategory) – The word category
Returns:: The set of word classes that occur with the given word category

class germanetpy.synset.WordClass(*args: Any, **kwargs: Any)[source]¶

Bases: Enum

This Enum class contains the semantic wordclasses / semantic fields a Synset can have in GermaNet. For a detailed description see: http://www.sfs.uni-tuebingen.de/GermaNet/germanet_structure.shtml#Tops

Allgemein = 1¶

Bewegung = 2¶

Gefuehl = 3¶

Geist = 4¶

Gesellschaft = 5¶

Koerper = 6¶

Menge = 7¶

natPhaenomen = 8¶

Ort = 9¶

Pertonym = 10¶

Perzeption = 11¶

privativ = 12¶

Relation = 13¶

Substanz = 14¶

Verhalten = 15¶

Zeit = 16¶

Artefakt = 17¶

Attribut = 18¶

Besitz = 19¶

Form = 20¶

Geschehen = 21¶

Gruppe = 22¶

Kognition = 23¶

Kommunikation = 24¶

Mensch = 25¶

Motiv = 26¶

Nahrung = 27¶

natGegenstand = 28¶

Pflanze = 29¶

Tier = 30¶

Tops = 31¶

Koerperfunktion = 32¶

Konkurrenz = 33¶

Kontakt = 34¶

Lokation = 35¶

Schoepfung = 36¶

Veraenderung = 37¶

Verbrauch = 38¶

static get_possible_word_categories(word_class)[source]¶

Each word class can occur with one or several word categories.

Return type:: set(WordCategory)
Parameters:: word_class (WordClass) – the word class to get the possible word categories for
Returns:: the set of word categories the given word class can occur with

class germanetpy.synset.Synset(id: str, word_category: WordCategory, word_class: WordClass)[source]¶

Bases: object

This class holds a Synset object. A synset in GermaNet contains several lexical units and holds specific relations to other synsets, for example a synset can have hypernyms or hyponyms.

add_lexunit(unit)[source]¶

Adds a lexical unit that is part of this synset to the list of lexical units

Parameters:: unit (Lexunit) – The lexUnit object to be added

is_root() → bool[source]¶

Returns:: True if this Synset is the root of the Graph (= has no hypernyms), otherwise false

is_leaf() → bool[source]¶

Returns:: True if this Synset is a leaf of the Graph (= has no hyponyms), otherwise false

num_lexunits() → int[source]¶

Returns:: The number of lexical units, contained in that synset

get_related_synsets(conrel_type, direction='outgoing')[source]¶

hypernym_paths() → list[source]¶

This method iterates recursively through the hypernyms of this synset to get all paths that connect this synset with the root node. A path is complete if it ends with the root node. All possible paths are returned. Each path is a list of nodes.

Returns:: A list of lists, each list contains a node sequence connecting this synset with the root node
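The recursion can be sketched on a toy graph in which one node has two hypernyms. This is an illustration with plain strings and a hypothetical child-to-hypernyms dictionary, not the library's implementation.

```python
def hypernym_paths(hypernyms, synset):
    """All node sequences that lead from synset up to a node without hypernyms."""
    parents = hypernyms.get(synset, [])
    if not parents:
        return [[synset]]  # reached the root: the path is complete
    return [[synset] + path
            for parent in parents
            for path in hypernym_paths(hypernyms, parent)]


# Toy graph: child -> direct hypernyms ("tomato" has two hypernyms).
toy_hypernyms = {"tomato": ["vegetable", "fruit"], "vegetable": ["food"],
                 "fruit": ["food"], "food": ["root"]}
toy_paths = hypernym_paths(toy_hypernyms, "tomato")
# The shortest path yields the minimum depth, counted in edges.
toy_min_depth = min(len(path) - 1 for path in toy_paths)
```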

all_hypernyms() → set[source]¶

This method extracts all hypernyms for this synset (the transitive closure for this synset)

Returns:: a set containing all possible hypernym nodes. It is empty if the current synset is the root node

hyponym_paths() → list[source]¶

This method iterates recursively through the hyponyms of this synset to get all paths that connect this synset with a leaf node. A path is complete if it ends with a leaf node. All possible paths are returned. Each path is a list of nodes.

Returns:: A list of lists, each list contains a node sequence connecting this synset with a leaf node

all_hyponyms() → set[source]¶

This method returns all possible hyponyms of this synset.

Returns:: [set(Synset)] A set of synset nodes, each constitutes a hyponym of the current synset.

shortest_path_to_root() → list[source]¶

This method returns the shortest path to the root node.

Returns:: [list(Synset)] shortest path to the root node.

common_hypernyms(other) → set[source]¶

Given another synset, this method computes shared hypernyms

Parameters:: other (Synset) – another synset object
Returns:: a set of synset nodes, that denotes the shared hypernyms between this synset and the given one.

min_depth() → int[source]¶

Returns:: The length of the shortest hypernym path from this synset to the root.

shortest_path_distance(other) → int[source]¶

Returns the distance of the shortest path linking the two synsets (if one exists). If a node is compared with itself 0 is returned. The distance is denoted by the number of edges that exist in the shortest path.

Parameters:: other (Synset) – The Synset to which the shortest path will be found.
Returns:: The number of edges in the shortest path connecting the two nodes, or None if no path exists.
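A breadth-first search illustrates the edge-count distance described above, including the 0 for identical nodes and None for unconnected ones. The sketch works on a hypothetical child-to-hypernyms dictionary of strings and treats hypernymy and hyponymy as undirected edges; the library's own traversal may differ.

```python
from collections import deque


def shortest_path_distance(hypernyms, source, target):
    """Number of edges on a shortest path, or None if the nodes are not connected."""
    neighbours = {}
    for child, parents in hypernyms.items():
        for parent in parents:
            neighbours.setdefault(child, set()).add(parent)
            neighbours.setdefault(parent, set()).add(child)
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distance[node]
        for neighbour in neighbours.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return None


toy_hypernyms = {"dog": ["animal"], "cat": ["animal"],
                 "animal": ["root"], "plant": ["root"]}
dog_to_cat = shortest_path_distance(toy_hypernyms, "dog", "cat")
```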

shortest_path(other) → list[source]¶

Returns the shortest possible sequence of synset nodes that are traversed from this synset to a given other synset. If there are several shortest sequences, all of them are returned.

Parameters:: other (Synset) – A synset the path should be computed to
Returns:: A list of lists, each list containing the sequence of nodes traversed from this synset to the given other synset.

shortest_path_to_hypernym(hypernym) → list[source]¶

The shortest path between this synset and the given hypernym. Asserts that the given other synset is a real hypernym of the current synset.

Parameters:: hypernym (Synset) – a synset, denoting the hypernym the shortest path should be computed to
Returns:: a list of lists, each list storing the shortest sequence of synset nodes traversed from self to the given hypernym

lowest_common_subsumer(other) → set[source]¶

Extracts the lowest common subsumer(s) / lowest common ancestor(s) of the current synset and a given one.

Parameters:: other (Synset) – Another synset object the LCS should be computed to.
Returns:: a set, containing one or several synset objects, being the LCS between the current synset and the given one.
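A lowest common subsumer can be sketched on a toy hierarchy. This example assumes one common definition (the shared ancestors with the smallest summed distance to both nodes) and assumes that a node counts as its own ancestor at distance 0; neither is confirmed as the library's behaviour. `DIRECT_HYPERNYMS`, `hypernym_distances`, and `lowest_common_subsumer_sketch` are hypothetical names.

```python
from collections import deque

# Hypothetical toy data (not GermaNet content): node -> direct hypernyms.
DIRECT_HYPERNYMS = {
    "root": [],
    "animal": ["root"],
    "mammal": ["animal"],
    "dog": ["mammal"],
    "cat": ["mammal"],
    "bird": ["animal"],
}


def hypernym_distances(node, graph):
    """Breadth-first search upwards: shortest edge count to each ancestor."""
    distances = {node: 0}  # assumption: a node is its own ancestor at distance 0
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in graph[current]:
            if parent not in distances:
                distances[parent] = distances[current] + 1
                queue.append(parent)
    return distances


def lowest_common_subsumer_sketch(a, b, graph=DIRECT_HYPERNYMS):
    """Return the shared ancestors that minimise the summed distance to a and b."""
    dist_a = hypernym_distances(a, graph)
    dist_b = hypernym_distances(b, graph)
    shared = dist_a.keys() & dist_b.keys()
    if not shared:
        return set()
    best = min(dist_a[n] + dist_b[n] for n in shared)
    return {n for n in shared if dist_a[n] + dist_b[n] == best}
```

The result is a set because several ancestors can tie for the smallest summed distance when a node has more than one direct hypernym.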

get_distances_hypernym_dic() → dict[source]¶

For each hypernym, store the shortest distance between the current synset and its hypernym.

Returns:: A dictionary containing all hypernyms of this synset as keys and the corresponding distances as values.

property id¶

property word_category¶

property word_class¶

property paraphrase¶

property lexunits¶

property relations¶

property incoming_relations¶

property direct_hypernyms¶

property direct_hyponyms¶

germanetpy.synsetLoader module¶

germanetpy.synsetLoader.get_attribute_element(attributes, element: str, enum)[source]¶

Constructs an Enum object from a given attribute.

Parameters:

attributes (XML attributes) – XML attributes of a certain XML node
element – A String
enum (FastEnum) – The Enum object that should be initialized

Returns:: The corresponding Enum object (FastEnum) or None

germanetpy.synsetLoader.get_attribute_element_without_enum(attributes, element: str)[source]¶

Returns the attribute value if the attribute exists.

Parameters:

attributes (XML attributes) – XML attributes of a certain XML node
element – A String

Returns:: The corresponding object or None

germanetpy.synsetLoader.create_compound_info(child) → CompoundInfo[source]¶

Creates a compound info object. This has a modifier (String) and a head (String). Each modifier and the head can have a property (CompoundProperty) and a category (CompoundCategory).

Parameters:: child – the XML element
Returns:: A CompoundInfo object

germanetpy.synsetLoader.load_lexunits(germanet, tree)[source]¶

Takes the XML tree and walks through it to create the Lexunit objects.

Parameters:

germanet (Germanet) – the germanet object
tree (Element Tree) – XML tree

germanetpy.synsetLoader.convert_example_edited(value: str) → bool[source]¶

Converts an example edited value to a boolean.

The example edited tag is element text in the XML data. The Java API uses Boolean.parseBoolean, so true/false are supported here. yes/no and 1/0 are accepted as well to be tolerant of existing GermaNet boolean conventions.
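The tolerant conversion described above might look roughly like the following. This is a sketch under assumptions, not the library's code: the name `parse_edited_flag` is hypothetical, and both the whitespace stripping and the ValueError for unrecognised input are choices made here (the documentation does not say how such values are handled). Case-insensitive matching mirrors Java's Boolean.parseBoolean.

```python
# Accepted spellings, following the conventions named in the description.
TRUE_VALUES = {"true", "yes", "1"}
FALSE_VALUES = {"false", "no", "0"}


def parse_edited_flag(value):
    """Hypothetical helper: map an 'example edited' text value to a bool."""
    normalized = value.strip().lower()  # assumption: ignore case and padding
    if normalized in TRUE_VALUES:
        return True
    if normalized in FALSE_VALUES:
        return False
    # Assumption: unknown values are rejected rather than silently mapped.
    raise ValueError(f"not a recognised boolean value: {value!r}")
```

Rejecting unknown values makes malformed data visible early; a loader that prefers leniency could return False instead, as Boolean.parseBoolean does for anything other than "true".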

germanetpy.synsetLoader.create_example(example_root) → Example[source]¶

Creates an Example object from an XML example element.

Parameters:: example_root – The XML example element.
Returns:: An Example object containing text, optional frame, and optional LLM metadata.

germanetpy.synsetLoader.create_lexunit(germanet, attributes, lex_root, synset) → Lexunit[source]¶

Given the XML data, creates a Lexunit object.

Parameters:

germanet (Germanet) – The germanet object.
attributes (XML attributes) – The XML attributes.
lex_root – The XML root
synset – the corresponding synset object

Returns:: a lexical unit object

germanetpy.synsetLoader.add_orth_forms(germanet, lexunit: Lexunit, child_value: str, tag: str)[source]¶

Checks which orthform the tag contains, and adds it to the lexunit object. Adds the lexunit id to the corresponding dictionary.

Parameters:

germanet (Germanet) – The germanet object containing the Orthform variant dictionaries.
lexunit – the Lexunit object the Orthform variant needs to be added to
child_value – the value of the XML element that contains this Orthform variant
tag – the value of the XML tag specifying the type of Orthform variant

germanetpy.utils module¶

germanetpy.utils.convert_to_boolean(attribute: str) → bool[source]¶

Converts the given String into a boolean.

Parameters:: attribute – The attribute that needs to be converted into a boolean
Returns:: True, False or an Error message if the attribute doesn’t have the right value

germanetpy.utils.parse_xml(datadir: str, f: str) → lxml.etree[source]¶

Parses an XML file and returns the XML tree

Parameters:

datadir – The directory where the file is located
f – the filename

Returns:

The parsed XML tree
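The directory-plus-filename pattern can be sketched with the standard library. Note the assumptions: the signature above returns an lxml tree, whereas this example uses `xml.etree.ElementTree` as a stand-in so that it runs without third-party packages; `parse_xml_sketch` and the sample file contents are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def parse_xml_sketch(datadir, filename):
    """Join directory and file name, then parse the file into an element tree."""
    return ET.parse(os.path.join(datadir, filename))


# Usage: write a tiny made-up XML file, then parse it back.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "sample.xml"), "w", encoding="utf-8") as handle:
        handle.write("<synsets><synset id='s1'/></synsets>")
    tree = parse_xml_sketch(tmp, "sample.xml")
    root_tag = tree.getroot().tag          # tag of the document element
    first_id = tree.getroot()[0].get("id")  # attribute of the first child
```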

germanetpy.wiktionaryLoader module¶

germanetpy.wiktionaryLoader.create_wiktionary(attributes) → WiktionaryParaphrase[source]¶

Creates a wiktionary object given the XML attributes that contain the required information

Parameters:: attributes – XML attributes that contain information about the wiktionary paraphrase
Returns:: a wiktionary object

germanetpy.wiktionaryLoader.load_wiktionary(germanet, tree)[source]¶

Given an XML tree, this method initializes the wiktionary objects and adds them to the germanet object and the corresponding lexunits.

Parameters:

germanet (Germanet) – The germanet object
tree (etree) – The XML tree of the wiktionary file

germanetpy.wiktionaryparaphrase module¶

class germanetpy.wiktionaryparaphrase.WiktionaryParaphrase(lexunit_id: str, wiktionary_id: str, wiktionary_sense_id: int, wiktionary_sense: str, edited: bool)[source]¶

Bases: object

property lexunit_id¶

property wiktionary_id¶

property wiktionary_sense_id¶

property wiktionary_sense¶

property edited¶
