[mindlog commit] r171 - trunk

codesite...@google.com

Dec 20, 2008, 12:28:35 PM
to mindl...@googlegroups.com
Author: klaus....@cobss.com
Date: Sat Dec 20 08:26:12 2008
New Revision: 171

Modified:
trunk/Mindlog-WordNet-kwl.1.cs

Log:
test for Google User Defect Report

Modified: trunk/Mindlog-WordNet-kwl.1.cs
==============================================================================
--- trunk/Mindlog-WordNet-kwl.1.cs (original)
+++ trunk/Mindlog-WordNet-kwl.1.cs Sat Dec 20 08:26:12 2008
@@ -1 +1 @@
-'From Squeak3.10.2 of ''5 June 2008'' [latest update: #7179] on 20 December 2008 at 10:15:48 am'!
"Change Set: Mindlog-WordNet
Date: 20 December 2008
Author: Klaus D. Witzel

<project home: http://code.google.com/p/mindlog/
code license: http://www.opensource.org/licenses/mit-license.php
content license: http://creativecommons.org/licenses/by-sa/3.0/
developed on platform: http://www.squeak.org/Download/
feedback and issues: http://code.google.com/p/mindlog/issues/list
project discussion: http://groups.google.com/group/mindlog-dev/topics

installation and use:

1 download WNprolog-3.0.tar.gz from http://wordnet.princeton.edu/obtain
2 untar the downloaded archive into a directory; this gives files with the .pl (Prolog) ending
3 fileIn this change set, it will load the .pl files (1-2 minutes)
4 during load it prints # of items found, per file, in the Transcript
5 with SUnit run WordNetTests (method #testIntegrity takes ca. 30 seconds)
6 you can now access synsets by lemma,
7 see method #synsetsAt: in the public interface
>"!

LookupKey variableSubclass: #WordNetLookupKey
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetLookupKey commentStamp: 'kwl 11/13/2008 11:26' prior: 0!
Instances of class WordNetLookupKey represent external keys of WordNet's
semantic relations and lemma's senses.

My 'key' is WordNet's lemma at its surface symbol, my indexable fields
store subinstances of WordNetsSynset.!

WordNetLookupKey variableSubclass: #WordNetsSynset
instanceVariableNames: 'arcs'
classVariableNames: 'AdjectiveIndex AdverbIndex AntonymIndex
AttributeIndex CauseIndex ClauseIndex ClusterIndex ColligationIndex
DerivationIndex EntailIndex InstanceIndex KindIndex MaskSynsetBits
MaskSynsetShift MemberIndex NounIndex PartIndex PertainsIndex RegionIndex
RuleOutIndex ScriptIndex SubstanceIndex TopicIndex UsageIndex VerbIndex'
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetsSynset commentStamp: 'kwl 11/20/2008 08:43' prior: 0!
Subinstances of class WordNetsSynset represent WordNet lemmas' senses and
their semantic relations.

Their 'key' field is WordNet's synset_id (syntactic category appended),
their 'arcs' field represents semantic relations (a collection of two-word
bit-vectors indexing my subinstances and their lemmas).

The indexable fields store the lemma's surface symbol, indexing is by
WordNet's lemma (aka word) number.

Break-down of the two-word bit-vector which represents a relation (viewed
as an arc in a graph):

48:8 here/this word's number/index
40:8 there/that word's number/index
31:8 relation's index number
23:23 synset index number (syntactic category appended)
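
As a minimal sketch of this layout, the two words of one arc could be packed and unpacked in a workspace as follows. The word numbers, relation index and synset index are made-up values for illustration; the shift of 23 and the mask mirror what the class-side #initialize assigns to MaskSynsetShift and MaskSynsetBits, and the packing follows #at:of:put:of:.

```smalltalk
"Workspace sketch; wNumX, wNumY, relId and synsetId are hypothetical values."
| shift mask wNumX wNumY relId synsetId firstVector secondVector |
shift := 23.                          "same value as MaskSynsetShift"
mask := (1 bitShift: shift) - 1.      "same value as MaskSynsetBits"
wNumX := 2. wNumY := 1. relId := 5. synsetId := 723222.
firstVector := (wNumX bitShift: 8) + wNumY.          "513"
secondVector := (relId bitShift: shift) + synsetId.  "42666262"
Transcript cr; show: firstVector printString.
Transcript cr; show: (secondVector bitShift: 0 - shift) printString.  "5, the relation index"
Transcript cr; show: (secondVector bitAnd: mask) printString.         "723222, the synset index"
```

The unpacking steps are the same ones #neighboursAt: and #relationIndices apply to every second element of 'arcs'.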

During initial load of WordNet's lex db, their synset_id's are renumbered
and their syntactic category is appended.

The two synset numbers are associated at the application level; for example,
the index for the think_of verb is associated with WordNet3.0's 200723222
synset_id. Associations in use are:

105764197 colligation
105820620 example, illustration, instance, representative
105849040 property, attribute, dimension
106288024 antonym, opposite_word, opposite
106290051 derivation
106292478 holonym, whole_name
106292836 hypernym, superordinate, superordinate_word
106292973 hyponym, subordinate, subordinate_word
106293746 meronym, part_name
106303682 synonym, equivalent_word
106304425 troponym, manner_name
106314144 clause
106317862 noun
106318062 verb
106319029 adjective
106319157 adverb
106322357 pertainym
107959943 bunch, clump, cluster, clustering
107997703 class, category, family
200723222 think_of
201147562 rule_out, rule_in
201756719 script
202634808 entail, implicate
!

WordNetsSynset variableSubclass: #WordNetAdjectiveSynset
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetAdjectiveSynset commentStamp: 'kwl 11/13/2008 11:25' prior: 0!
See description in my superclass.!

WordNetsSynset variableSubclass: #WordNetAdverbSynset
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetAdverbSynset commentStamp: 'kwl 11/13/2008 11:25' prior: 0!
See description in my superclass.!

WordNetsSynset variableSubclass: #WordNetNounSynset
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetNounSynset commentStamp: 'kwl 11/13/2008 11:24' prior: 0!
See description in my superclass.!

WordNetsSynset variableSubclass: #WordNetVerbSynset
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetVerbSynset commentStamp: 'kwl 11/13/2008 11:24' prior: 0!
See description in my superclass.!

WordNetsSynset variableSubclass: #WordRootNet
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordRootNet commentStamp: 'kwl 11/20/2008 09:31' prior: 0!
An instance of class WordRootNet holds the root of WordNet's hypergraph.

Indexable fields hold dictionaries for accessing nouns, verbs, adjectives
and adverbs by synset_id.

Field 'arcs' holds the binary searchable collection of synsets, with one
entry per lemma (word) of the hypergraph.

A supplemental indexed field holds the collection of primitive relations
(think_of, kind_of, opposite_of, etc).
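
A minimal workspace sketch of walking the four dictionaries, modeled on WordNetTests>>setUp and #testIntegrity. It assumes this change set has already been filed in, so that the global #WordNet30 exists; the variable names are illustrative only.

```smalltalk
"Assumes WordNet30 was installed by WordNetsSynset initialize."
| wordNet |
wordNet := nil environment associationAt: #WordNet30.
"The last indexable field holds the primitive relations, so skip it."
(1 to: wordNet basicSize - 1)
	with: #('noun' 'verb' 'adjective' 'adverb')
	do: [:pos :name |
		Transcript cr; show: name; show: ' synsets: ';
			show: (wordNet basicAt: pos) size printString]
```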
!

!WordNetLookupKey methodsFor: 'accessing' stamp: 'kwl 11/13/2008 15:22'!
adjectives
"answer a collection of the receiver's adjectives"
| answer synset |
answer := OrderedCollection new: self basicSize.
1 to: self basicSize do: [:wNum |
((synset := self basicAt: wNum) isMemberOf: WordNetAdjectiveSynset)
ifTrue: [answer add: synset]
].
^ answer! !

!WordNetLookupKey methodsFor: 'accessing' stamp: 'kwl 11/13/2008 15:22'!
adverbs
"answer a collection of the receiver's adverbs"
| answer synset |
answer := OrderedCollection new: self basicSize.
1 to: self basicSize do: [:wNum |
((synset := self basicAt: wNum) isMemberOf: WordNetAdverbSynset)
ifTrue: [answer add: synset]
].
^ answer! !

!WordNetLookupKey methodsFor: 'accessing' stamp: 'kwl 11/13/2008 15:23'!
nouns
"answer a collection of the receiver's nouns"
| answer synset |
answer := OrderedCollection new: self basicSize.
1 to: self basicSize do: [:wNum |
((synset := self basicAt: wNum) isMemberOf: WordNetNounSynset)
ifTrue: [answer add: synset]
].
^ answer! !

!WordNetLookupKey methodsFor: 'accessing' stamp: 'kwl 11/13/2008 15:21'!
verbs
"answer a collection of the receiver's verbs"
| answer synset |
answer := OrderedCollection new: self basicSize.
1 to: self basicSize do: [:wNum |
((synset := self basicAt: wNum) isMemberOf: WordNetVerbSynset)
ifTrue: [answer add: synset]
].
^ answer! !

!WordNetLookupKey methodsFor: 'private' stamp: 'kwl 11/14/2008 13:39'!
key: anObject
1 to: self basicSize do: [:wNum |
(self basicAt: wNum) ifNotNilDo: [:you | you shareKey: anObject]].
^ super key: anObject! !

TestCase subclass: #WordNetTests
instanceVariableNames: 'wordNet'
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNetTests'!

!WordNetTests commentStamp: 'kwl 11/19/2008 08:20' prior: 0!
Tests for the integrity of the WordNet db and the public interface.!


!WordNetTests methodsFor: 'running' stamp: 'kwl 11/19/2008 08:09'!
setUp
wordNet := nil environment associationAt: #WordNet30! !

!WordNetTests methodsFor: 'running' stamp: 'kwl 11/19/2008 08:09'!
tearDown
wordNet := nil! !

!WordNetTests methodsFor: 'testing - integrity' stamp: 'kwl 11/19/2008 20:56'!
testIntegrity
| relationsIx inverseYz here other count |
(1 to: wordNet basicSize -1) with: #('noun' 'verb' 'adjective' 'adverb')
do: [:pos :mfc |
'testing integrity of ', mfc, ' relations'
displayProgressAt: Sensor cursorPoint from: (count := 0) to: (other :=
wordNet basicAt: pos) size during: [:bar |
other associationsDo: [:synset |
bar value: (count := count +1).
inverseYz := (relationsIx := synset relationIndices) collect: [:each |
wordNet inverseRelIdOf: each].
here := synset key.
relationsIx with: inverseYz do: [:iX :jY |
(synset neighboursAt: iX) do: [:neighbour |
other := wordNet senseAt: neighbour.
self shouldnt: [other indexOfRelation: jY of: here]
raise: Error
]
]
]
]
]! !


!WordNetsSynset methodsFor: 'accessing' stamp: 'kwl 11/12/2008 16:36'!
partOfSpeech
"answer the receiver's part of speech"
^ #'think_of'! !

!WordNetsSynset methodsFor: 'accessing-relations' stamp: 'kwl 11/20/2008 08:14'!
indexOfRelation: relId of: synsetId
"answer the index of my arc {wNum,relId,wNum,synsetId}
or raise an error if no such relation"
| secondVector |
arcs ifNil: [^ self error: 'relation not found'].
secondVector := (relId bitShift: MaskSynsetShift) + synsetId.
2 to: arcs size by: 2 do: [:index | ((arcs at: index) = secondVector)
ifTrue: [^ index ]].
^ self error: 'relation not found'! !

!WordNetsSynset methodsFor: 'accessing-relations' stamp: 'kwl 11/20/2008 08:19'!
inverseRelIdOf: relId
"answer the index of the inverse relation to
the argument, or the argument itself if the relation is its own inverse"
| implemented relation inverse answer |
relation := (implemented := self basicAt: self basicSize) at: relId.
answer := relId even ifTrue: [relId -1] ifFalse: [relId +1].
inverse := implemented at: answer.
1 to: relation basicSize do: [:wNum |
(inverse basicAt: wNum) = (relation basicAt: wNum)
ifFalse: [^ answer]
].
^ relId! !

!WordNetsSynset methodsFor: 'accessing-relations' stamp: 'kwl 11/20/2008 08:24'!
neighboursAt: relId
"answer a collection of the receiver's neighbours' ids under relId"
| answer negShift here |
arcs ifNil: [^ #() ].
answer := OrderedCollection new.
negShift := 0 - MaskSynsetShift.
2 to: arcs size by: 2 do: [:index |
((here := arcs at: index) bitShift: negShift) = relId
ifTrue: [answer add: (here bitAnd: MaskSynsetBits)].
].
^ answer! !

!WordNetsSynset methodsFor: 'accessing-relations' stamp: 'kwl 11/20/2008 08:33'!
relationIndices
"answer a collection of the receiver's relation indices"
| answer negShift |
arcs ifNil: [^ #() ].
answer := IdentitySet new.
negShift := 0 - MaskSynsetShift.
2 to: arcs size by: 2 do: [:index |
answer add: ((arcs at: index) bitShift: negShift).
].
^ answer asArray
! !

!WordNetsSynset methodsFor: 'printing' stamp: 'kwl 11/16/2008 15:14'!
printOn: aStream

aStream print: key; space; nextPutAll: self partOfSpeech; space; nextPut:
${.
1 to: self basicSize -1 do: [:ix | aStream print: (self basicAt: ix);
space].
aStream print: (self basicAt: self basicSize); nextPut: $}! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/20/2008 08:02'!
at: wNumX of: relIdX put: wNumY of: senseIdY
"this is part of class initialization"
| firstVector secondVector wordArray |
arcs isInteger ifTrue: [WordArray streamContents: [:arcStream |
arcStream nextPut: arcs.
arcs := arcStream]
].
firstVector := (wNumX bitShift: 8) + wNumY.
self assert: [((secondVector := (relIdX bitShift: MaskSynsetShift) +
senseIdY) bitAnd: MaskSynsetBits) = senseIdY]
description: ['cannot compact 2nd vector'].
wordArray := arcs braceArray.
3 to: arcs position by: 2 do: [:index | ((wordArray at: index) =
secondVector)
ifTrue: ["Princetonian duplicate" ^ self]].
arcs
nextPut: firstVector;
nextPut: secondVector! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/19/2008 20:33'!
clampArcStreams
arcs ifNil: [^ self].
arcs isStream ifTrue: [
key := arcs braceArray first.
^ arcs := arcs braceArray copyFrom: 2 to: arcs position].
arcs isInteger ifTrue: [key := arcs. ^ arcs := nil].
1 to: self basicSize -1 do: [:pos |
(self basicAt: pos)
valuesDo: [:synset | synset clampArcStreams];
rehash
]! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/13/2008 11:47'!
key: aKey arcs: anObject
arcs := anObject.
^ super key: aKey! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/19/2008 20:12'!
newKey
"this is part of class initialization"
^ arcs isInteger
ifTrue: [arcs]
ifFalse: [arcs braceArray first]! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/12/2008 21:15'!
shareKey: anObject
1 to: self basicSize do: [:wNum |
(self basicAt: wNum) = anObject
ifTrue: [^ self basicAt: wNum put: anObject]
]! !


!WordNetAdjectiveSynset methodsFor: 'accessing' stamp: 'kwl 11/16/2008 15:11'!
partOfSpeech
"answer the receiver's part of speech"
^ #adjective! !


!WordNetAdverbSynset methodsFor: 'accessing' stamp: 'kwl 11/16/2008 15:12'!
partOfSpeech
"answer the receiver's part of speech"
^ #adverb! !


!WordNetNounSynset methodsFor: 'accessing' stamp: 'kwl 11/16/2008 15:12'!
partOfSpeech
"answer the receiver's part of speech"
^ #noun! !


!WordNetVerbSynset methodsFor: 'accessing' stamp: 'kwl 11/16/2008 15:12'!
partOfSpeech
"answer the receiver's part of speech"
^ #verb! !


!WordNetsSynset class methodsFor: 'class initialization' stamp: 'kwl 11/19/2008 20:35'!
initialize
"WordNetsSynset initialize"
| lexEntries slash localDirectory fileStream fileName doSkip temp lexDict |

MaskSynsetBits := (1 bitShift: (MaskSynsetShift := 23)) -1.

lexEntries := {IdentityDictionary new. IdentityDictionary new.
IdentityDictionary new. IdentityDictionary new}.
slash := FileDirectory default pathNameDelimiter asString.
localDirectory := '..' , slash, 'WNprolog3.0' , slash.
fileStream := FileDirectory default readOnlyFileNamed: localDirectory,
(fileName := 'wn_sk.pl').
localDirectory := '.' , slash.
doSkip := [fileStream upTo: $.. [(temp := fileStream peek) notNil and:
[temp isSeparator]]
whileTrue: [fileStream next]].
temp := localDirectory, fileStream directory localName, slash, fileName.
'sense keys ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries senseKeys: fileStream progress: bar with:
doSkip.
].
lexDict := (temp := Array new: (lexEntries inject: 1 into: [:sum :dict |
sum + dict size]) * 3 >> 1) writeStream.
'lexical dictionary ' displayProgressAt: Sensor cursorPoint
from: 0 to: temp size
during: [:bar |
self initialize: lexEntries dictionary: lexDict progress: bar.
].
lexEntries := lexEntries, {OrderedCollection new. lexDict contents}.
self initializeMore: lexEntries.
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_ant.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'antonyms ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries antonyms: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_at.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'attributes ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries attributes: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_cls.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'categories ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries categories: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_cs.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'causes ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries causes: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_der.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'derivationals ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries derivationals: fileStream progress: bar
with: doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_ent.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'entailments ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries entailments: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_hyp.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'hypernyms ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries hypernyms: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_ins.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'instances ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries instances: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_mm.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'members ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries members: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_mp.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'parts ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries parts: fileStream progress: bar with: doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_per.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'pertainyms ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries pertainyms: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_sim.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'similarities ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries similarities: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_ms.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'substances ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries substances: fileStream progress: bar with:
doSkip.
].
nil environment removeKey: (temp := #WordNet30) ifAbsent: nil.
temp := nil environment add: ((WordRootNet basicNew: lexEntries size -1)
key: temp arcs: lexEntries last).
lexEntries allButLast withIndexDo: [:dict :index | temp basicAt: index
put: dict].
temp clampArcStreams
! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/13/2008 14:27'!
adapt: synsetClone at: pos from: fromWord to: toWord
synsetClone assert: [(synsetClone basicAt: pos) = fromWord]
description: ['could not find the ', fromWord, ' synset #', synsetClone
key printString].
^ synsetClone basicAt: pos put: toWord; yourself! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:13'!
initialize: lexEntries antonyms: fileStream progress: indicator with: doSkip
| posX senseX wNumX posY senseY wNumY synsetX synsetY count |
AntonymIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6288024) clone at: 1
from: 'antonym' to: 'opposite_of');
add: (self adapt: (lexEntries first at: 6288024) clone at: 1
from: 'antonym' to: 'opposite_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
wNumX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $,) readStream.
wNumY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: wNumX of: AntonymIndex put: wNumY of: synsetY newKey.
synsetY at: wNumY of: AntonymIndex put: wNumX of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:13'!
initialize: lexEntries attributes: fileStream progress: indicator with:
doSkip
| posX senseX posY senseY synsetX synsetY whichIndex count |
AttributeIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 5849040) clone at: 2
from: 'attribute' to: 'attribute_to');
add: (self adapt: (lexEntries first at: 5849040) clone at: 2
from: 'attribute' to: 'attribute_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
(synsetY isMemberOf: WordNetAdjectiveSynset)
ifTrue: [whichIndex := AttributeIndex -1]
ifFalse: [whichIndex := AttributeIndex].
synsetX at: 0 of: whichIndex put: 0 of: synsetY newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries categories: fileStream progress: indicator with:
doSkip
| posX senseX wNumX posY senseY wNumY categoryIndices synsetX synsetY
whichRelation count |
TopicIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'topic_to');
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'topic_of'); size.
UsageIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'usage_to');
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'usage_of'); size.
RegionIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'region_to');
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'region_of'); size.
categoryIndices := {TopicIndex. UsageIndex. RegionIndex}.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
wNumX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $,) readStream.
wNumY := Integer readFrom: (fileStream upTo: $,) readStream.
synsetY := (lexEntries at: posY) at: senseY.
whichRelation := categoryIndices at: (#($t $u $r) indexOf: fileStream
peek).
synsetX at: wNumX of: whichRelation put: wNumY of: synsetY newKey.
synsetY at: wNumY of: whichRelation -1 put: wNumX of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries causes: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
CauseIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 7326557) clone at: 1
from: 'cause' to: 'cause_to');
add: (self adapt: (lexEntries first at: 7326557) clone at: 1
from: 'cause' to: 'cause_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: CauseIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: CauseIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries derivationals: fileStream progress: indicator with:
doSkip
| posX senseX wNumX posY senseY wNumY synsetX synsetY count |
DerivationIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 13462387) clone at: 1
from: 'derivation' to: 'derivation_to');
add: (self adapt: (lexEntries first at: 13462387) clone at: 1
from: 'derivation' to: 'derivation_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
wNumX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $,) readStream.
wNumY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: wNumX of: DerivationIndex put: wNumY of: synsetY newKey.
synsetY at: wNumY of: DerivationIndex -1 put: wNumX of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/14/2008 14:49'!
initialize: lexEntries dictionary: lexDict progress: bar
"WordNetsSynset initialize"
| posHint senseHint count wordStream surface index temp wordSet |
count := 0.
lexEntries withIndexDo: [:dict :pos |
posHint := #('n' 'v' 'j' 'w') at: pos.
dict keysAndValuesDo: [:senseId :synset |
senseHint := senseId printString.
1 to: synset basicSize do: [:wNum |
lexDict nextPut: (synset basicAt: wNum), ' ', posHint, senseHint
]
].
bar value: (count := count + dict size).
].
wordStream := lexDict contents asSortedCollection.
bar value: (count := 0).
surface := nil.
lexDict reset.
wordStream do: [:sortKey |
index := sortKey size.
[(sortKey at: (index := index -1)) isDigit] whileTrue.
posHint := #($n $v $j $w) indexOf: (sortKey at: index).
senseHint := Integer readFrom: (sortKey readStream position: index;
yourself).
temp := sortKey copyFrom: 1 to: index -2.
surface
ifNil: [surface := temp. wordStream := (Array new: 31) writeStream].
temp = surface
ifFalse: [wordSet := superclass basicNew: (index := wordStream contents)
size.
index withIndexDo: [:synset :wNum | wordSet basicAt: wNum put: synset].
lexDict nextPut: (wordSet key: surface; yourself).
bar value: (count := count + 1).
surface := temp. wordStream reset
].
wordStream nextPut: ((lexEntries at: posHint) at: senseHint)
].
wordSet := superclass basicNew: (index := wordStream contents) size.
index withIndexDo: [:synset :wNum | wordSet basicAt: wNum put: synset].
lexDict nextPut: (wordSet key: surface; yourself).
bar value: (count := count + 1).
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries entailments: fileStream progress: indicator with:
doSkip
| posX senseX posY senseY synsetX synsetY count |
EntailIndex := lexEntries fifth
add: (self adapt: (lexEntries second at: 2634808) clone at: 1
from: 'entail' to: 'entail_to');
add: (self adapt: (lexEntries second at: 2634808) clone at: 1
from: 'entail' to: 'entail_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: EntailIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: EntailIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries hypernyms: fileStream progress: indicator with:
doSkip
| posX senseX posY senseY synsetX synsetY count |
KindIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6292836) clone at: 1
from: 'hypernym' to: 'kind_to');
add: (self adapt: (lexEntries first at: 6292836) clone at: 1
from: 'hypernym' to: 'kind_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: KindIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: KindIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries instances: fileStream progress: indicator with:
doSkip
| posX senseX posY senseY synsetX synsetY count |
InstanceIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 5820620) clone at: 3
from: 'instance' to: 'instance_to');
add: (self adapt: (lexEntries first at: 5820620) clone at: 3
from: 'instance' to: 'instance_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: InstanceIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: InstanceIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries members: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
MemberIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'member_to');
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'member_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: MemberIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: MemberIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:15'!
initialize: lexEntries parts: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
PartIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'part_to');
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'part_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: PartIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: PartIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:15'!
initialize: lexEntries pertainyms: fileStream progress: indicator with:
doSkip
| posX senseX wNumX posY senseY wNumY synsetX synsetY count |
PertainsIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6322357) clone at: 1
from: 'pertainym' to: 'pertains_to');
add: (self adapt: (lexEntries first at: 6322357) clone at: 1
from: 'pertainym' to: 'pertains_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
wNumX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $,) readStream.
wNumY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: wNumX of: PertainsIndex put: wNumY of: synsetY newKey.
synsetY at: wNumY of: PertainsIndex -1 put: wNumX of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 19:57'!
initialize: lexEntries senseKeys: fileStream progress: indicator with:
doSkip
"WordNetsSynset initialize"
| temp posK senseK wNumK surface other newIndex |
[fileStream atEnd]
whileFalse:
[posK := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseK := Integer readFrom: (fileStream upTo: $,) readStream.
wNumK := Integer readFrom: (fileStream upTo: $,) readStream.
surface := (String readFrom: (fileStream upTo: $)) readStream) readStream
upTo: $%.
temp := lexEntries at: posK.
posK := temp at: senseK ifAbsent: [temp at: senseK put: OrderedCollection
new].
[posK size < wNumK] whileTrue: [posK addLast: wNumK].
posK at: wNumK put: surface.
doSkip value.
indicator value: fileStream position].
fileStream close.
temp := lexEntries inject: 1 into: [:size :dict | size + dict size].
other := (Array new: temp) writeStream.
surface := (Array new: temp) writeStream.
wNumK := {WordNetNounSynset. WordNetVerbSynset. WordNetAdjectiveSynset.
WordNetAdverbSynset}.
newIndex := (posK := 0).
lexEntries with: wNumK do: [:dict :subclass |
posK := posK +1.
dict associationsDo: [:assoc |
temp := subclass basicNew: (senseK := assoc value) size.
temp key: assoc key arcs: ((newIndex := newIndex +1) bitShift: 3) + posK.
senseK withIndexDo: [:word :ix | temp basicAt: ix put: word].
other nextPut: assoc.
surface nextPut: temp.
]].
other contents elementsExchangeIdentityWith: surface contents.
lexEntries do: [:dict | dict rehash]! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:15'!
initialize: lexEntries similarities: fileStream progress: indicator with:
doSkip
| posX senseX posY senseY synsetX synsetY count |
ClusterIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6322357) clone at: 1
from: 'pertainym' to: 'cluster_to');
add: (self adapt: (lexEntries first at: 6322357) clone at: 1
from: 'pertainym' to: 'cluster_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: ClusterIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: ClusterIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:15'!
initialize: lexEntries substances: fileStream progress: indicator with:
doSkip
| posX senseX posY senseY synsetX synsetY count |
SubstanceIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'substance_to');
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'substance_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: SubstanceIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: SubstanceIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 06:47'!
initializeMore: lexEntries
RuleOutIndex := lexEntries fifth
add: (self adapt: (lexEntries second at: 1147562) clone at: 1
from: 'rule_out' to: 'rule_out');
add: (self adapt: (lexEntries second at: 1147562) clone at: 2
from: 'rule_in' to: 'rule_in'); size.
NounIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6317862) clone at: 1 from: 'noun'
to: 'non-noun');
add: (self adapt: (lexEntries first at: 6317862) clone at: 1 from: 'noun'
to: 'noun'); size.
VerbIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6318062) clone at: 1 from: 'verb'
to: 'non-verb');
add: (self adapt: (lexEntries first at: 6318062) clone at: 1 from: 'verb'
to: 'verb'); size.
AdjectiveIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6319029) clone at: 1
from: 'adjective' to: 'non-adjective');
add: (self adapt: (lexEntries first at: 6319029) clone at: 1
from: 'adjective' to: 'adjective'); size.
AdverbIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6319157) clone at: 1
from: 'adverb' to: 'non-adverb');
add: (self adapt: (lexEntries first at: 6319157) clone at: 1
from: 'adverb' to: 'adverb'); size.
ColligationIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 5764197) clone at: 1
from: 'colligation' to: 'colligation_to');
add: (self adapt: (lexEntries first at: 5764197) clone at: 1
from: 'colligation' to: 'colligation_of'); size.
ClauseIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6314144) clone at: 1
from: 'clause' to: 'clause_to');
add: (self adapt: (lexEntries first at: 6314144) clone at: 1
from: 'clause' to: 'clause_of'); size.
ScriptIndex := lexEntries fifth
add: (self adapt: (lexEntries second at: 1756719) clone at: 1
from: 'script' to: 'script_to');
add: (self adapt: (lexEntries second at: 1756719) clone at: 1
from: 'script' to: 'script_of'); size! !


!WordRootNet methodsFor: 'accessing' stamp: 'kwl 11/19/2008 20:45'!
senseAt: wNumSynsetId
"given the argument answer a synset"
| synsetId |
synsetId := wNumSynsetId bitAnd: MaskSynsetBits.
^ (self basicAt: (synsetId bitAnd: 7)) at: synsetId! !

!WordRootNet methodsFor: 'accessing' stamp: 'kwl 11/14/2008 14:39'!
synsetsAt: surface
"answer synsets which contain the argument, a lemma"
| asciiOrder |
asciiOrder := String classPool at: #AsciiOrder.
^ arcs findBinary: [:synsets |
2 - (surface class compare: synsets key with: surface collated:
asciiOrder)] ifNone: nil! !


!WordRootNet class methodsFor: 'public interface' stamp: 'kwl 11/14/2008
15:42'!
synsetsAt: surface
"answer synsets which contain the argument, a lemma"
^ (nil environment associationAt: #WordNet30)
synsetsAt: surface
"(WordRootNet synsetsAt: 'surface') explore"! !


!Object methodsFor: '*Mindlog-monkeypatches' stamp: 'kwl 11/13/2008 14:34'!
assert: boolBlock description: stringBlock
"Throw an assertion error if boolBlock does not evaluate to true."

"Enh: conditionally evaluate stringBlock, defaults to Object>>#value "
boolBlock value ifFalse: [ AssertionFailure signal: stringBlock value ]! !


!TestCase methodsFor: '*Mindlog-monkeypatches' stamp: 'kwl 11/20/2008
09:22'!
assert: aBooleanOrBlock description: aStringOrBlock
| aString |
aBooleanOrBlock value ifFalse: [
"Enh: conditionally evaluate stringBlock, defaults to Object>>#value "
aString := aStringOrBlock value.
self logFailure: aString.
TestResult failure signal: aString]
! !

WordNetsSynset initialize!


\ No newline at end of file
+'From Squeak3.10.2 of ''5 June 2008'' [latest update: #7179] on 20
December 2008 at 10:15:48 am'!
"Change Set: Mindlog-WordNet
Date: 20 December 2008
Author: Klaus D. Witzel

<project home: http://code.google.com/p/mindlog/
code license: http://www.opensource.org/licenses/mit-license.php
content license: http://creativecommons.org/licenses/by-sa/3.0/
developed on platform: http://www.squeak.org/Download/
feedback and issues: http://code.google.com/p/mindlog/issues/list
(test line for User Defect Report in
http://code.google.com/p/support/issues/entry)
project discussion: http://groups.google.com/group/mindlog-dev/topics

installation and use:

1 download WNprolog-3.0.tar.gz from http://wordnet.princeton.edu/obtain
2 untar the downloaded into a directory, gives files with .pl (Prolog)
ending
3 fileIn this change set, it will load the .pl files (1-2 minutes)
4 during load it prints # of items found, per file, in the Transcript
5 with SUnit run WordNetTests (method #testIntegrity takes ca. 30 seconds)
6 you can now access synsets by lemma,
7 see method #synsetsAt: in the public interface
>"!

LookupKey variableSubclass: #WordNetLookupKey
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetLookupKey commentStamp: 'kwl 11/13/2008 11:26' prior: 0!
Instances of class WordNetLookupKey represent external keys of WordNet's
semantic relations and lemma's senses.

My 'key' is WordNet's lemma at its surface symbol, my indexable fields
store subinstances of WordNetsSynset.!

WordNetLookupKey variableSubclass: #WordNetsSynset
instanceVariableNames: 'arcs'
classVariableNames: 'AdjectiveIndex AdverbIndex AntonymIndex
AttributeIndex CauseIndex ClauseIndex ClusterIndex ColligationIndex
DerivationIndex EntailIndex InstanceIndex KindIndex MaskSynsetBits
MaskSynsetShift MemberIndex NounIndex PartIndex PertainsIndex RegionIndex
RuleOutIndex ScriptIndex SubstanceIndex TopicIndex UsageIndex VerbIndex'
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetsSynset commentStamp: 'kwl 11/20/2008 08:43' prior: 0!
Subinstances of class WordNetsSynset represent WordNet lemmas' senses and
their semantic relations.

Their 'key' field is WordNet's synset_id (syntactic category appended),
their 'arcs' field represents semantic relations (a collection of two-word
bit-vectors indexing my subinstances and their lemmas).

The indexable fields store the lemma's surface symbol, indexing is by
WordNet's lemma (aka word) number.

Break-down of the two-word bit-vector which represents a relation (viewed
as an arc in a graph):

48:8 here/this word's number/index
40:8 there/that word's number/index
31:8 relation's index number
23:23 synset index number (syntactic category appended)
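
As a minimal sketch of how the two words of an arc are packed and unpacked (it mirrors the arithmetic in #at:of:put:of: and #neighboursAt:, with MaskSynsetShift = 23; the relation index 5, synset index 1234 and word numbers 2 and 3 are made-up values, for illustration only):

```smalltalk
| relId synsetId first second |
relId := 5.                              "hypothetical relation index"
synsetId := (1234 bitShift: 3) + 1.      "hypothetical synset index, category 1 appended"
first := (2 bitShift: 8) + 3.            "here word 2, there word 3"
second := (relId bitShift: 23) + synsetId.
second bitShift: -23.                    "answers relId"
second bitAnd: (1 bitShift: 23) - 1.     "answers synsetId"
synsetId bitAnd: 7.                      "answers the appended category"
```

The low three bits of the synset index carry the syntactic category, which is what #senseAt: uses to pick the dictionary to look in.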

During initial load of WordNet's lex db, their synset_id's are renumbered
and their syntactic category is appended.

The two synset numbers are associated at the application level, for example
the index for the think_of verb is associated with WordNet3.0's 200723222
synset_id. Associations in use are

105764197 colligation
105820620 example, illustration, instance, representative
105849040 property, attribute, dimension
106288024 antonym, opposite_word, opposite
106290051 derivation
106292478 holonym, whole_name
106292836 hypernym, superordinate, superordinate_word
106292973 hyponym, subordinate, subordinate_word
106293746 meronym, part_name
106303682 synonym, equivalent_word
106304425 troponym, manner_name
106314144 clause
106317862 noun
106318062 verb
106319029 adjective
106319157 adverb
106322357 pertainym
107959943 bunch, clump, cluster, clustering
107997703 class, category, family
200723222 think_of
201147562 rule_out, rule_in
201756719 script
202634808 entail, implicate
!

WordNetsSynset variableSubclass: #WordNetAdjectiveSynset
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetAdjectiveSynset commentStamp: 'kwl 11/13/2008 11:25' prior: 0!
See description in my superclass.!

WordNetsSynset variableSubclass: #WordNetAdverbSynset
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetAdverbSynset commentStamp: 'kwl 11/13/2008 11:25' prior: 0!
See description in my superclass.!

WordNetsSynset variableSubclass: #WordNetNounSynset
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetNounSynset commentStamp: 'kwl 11/13/2008 11:24' prior: 0!
See description in my superclass.!

WordNetsSynset variableSubclass: #WordNetVerbSynset
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordNetVerbSynset commentStamp: 'kwl 11/13/2008 11:24' prior: 0!
See description in my superclass.!

WordNetsSynset variableSubclass: #WordRootNet
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNet3v0'!

!WordRootNet commentStamp: 'kwl 11/20/2008 09:31' prior: 0!
An instance of class WordRootNet holds the root of WordNet's hypergraph.

Indexable fields hold dictionaries for accessing nouns, verbs, adjectives
and adverbs by synset_id.

Field 'arcs' holds the binary searchable collection of synsets, with one
entry per lemma (word) of the hypergraph.

A supplemental indexed field holds the collection of primitive relations
(think_of, kind_of, opposite_of, etc).
!

!WordNetLookupKey methodsFor: 'accessing' stamp: 'kwl 11/13/2008 15:22'!
adjectives
"answer a collection of the receiver's adjectives"
| answer synset |
answer := OrderedCollection new: self basicSize.
1 to: self basicSize do: [:wNum |
((synset := self basicAt: wNum) isMemberOf: WordNetAdjectiveSynset)
ifTrue: [answer add: synset]
].
^ answer! !

!WordNetLookupKey methodsFor: 'accessing' stamp: 'kwl 11/13/2008 15:22'!
adverbs
"answer a collection of the receiver's adverbs"
| answer synset |
answer := OrderedCollection new: self basicSize.
1 to: self basicSize do: [:wNum |
((synset := self basicAt: wNum) isMemberOf: WordNetAdverbSynset)
ifTrue: [answer add: synset]
].
^ answer! !

!WordNetLookupKey methodsFor: 'accessing' stamp: 'kwl 11/13/2008 15:23'!
nouns
"answer a collection of the receiver's nouns"
| answer synset |
answer := OrderedCollection new: self basicSize.
1 to: self basicSize do: [:wNum |
((synset := self basicAt: wNum) isMemberOf: WordNetNounSynset)
ifTrue: [answer add: synset]
].
^ answer! !

!WordNetLookupKey methodsFor: 'accessing' stamp: 'kwl 11/13/2008 15:21'!
verbs
"answer a collection of the receiver's verbs"
| answer synset |
answer := OrderedCollection new: self basicSize.
1 to: self basicSize do: [:wNum |
((synset := self basicAt: wNum) isMemberOf: WordNetVerbSynset)
ifTrue: [answer add: synset]
].
^ answer! !

!WordNetLookupKey methodsFor: 'private' stamp: 'kwl 11/14/2008 13:39'!
key: anObject
1 to: self basicSize do: [:wNum |
(self basicAt: wNum) ifNotNilDo: [:you | you shareKey: anObject]].
^ super key: anObject! !

TestCase subclass: #WordNetTests
instanceVariableNames: 'wordNet'
classVariableNames: ''
poolDictionaries: ''
category: 'Mindlog-WordNetTests'!

!WordNetTests commentStamp: 'kwl 11/19/2008 08:20' prior: 0!
Tests for the integrity of the WordNet db and the public interface.!


!WordNetTests methodsFor: 'running' stamp: 'kwl 11/19/2008 08:09'!
setUp
wordNet := nil environment associationAt: #WordNet30! !

!WordNetTests methodsFor: 'running' stamp: 'kwl 11/19/2008 08:09'!
tearDown
wordNet := nil! !

!WordNetTests methodsFor: 'testing - integrity' stamp: 'kwl 11/19/2008
20:56'!
testIntegrity
| relationsIx inverseYz here other count |
(1 to: wordNet basicSize -1) with: #('noun' 'verb' 'adjective' 'adverb')
do: [:pos :mfc |
'testing integrity of ', mfc, ' relations'
displayProgressAt: Sensor cursorPoint from: (count := 0) to: (other :=
wordNet basicAt: pos) size during: [:bar |
other associationsDo: [:synset |
bar value: (count := count +1).
inverseYz := (relationsIx := synset relationIndices) collect: [:each |
wordNet inverseRelIdOf: each].
here := synset key.
relationsIx with: inverseYz do: [:iX :jY |
(synset neighboursAt: iX) do: [:neighbour |
other := wordNet senseAt: neighbour.
self shouldnt: [other indexOfRelation: jY of: here]
raise: Error
]
]
]
]
]! !


!WordNetsSynset methodsFor: 'accessing' stamp: 'kwl 11/12/2008 16:36'!
partOfSpeech
"answer the receiver's part of speech"
^ #'think_of'! !

!WordNetsSynset methodsFor: 'accessing-relations' stamp: 'kwl 11/20/2008
08:14'!
indexOfRelation: relId of: synsetId
"answer the index of my arc {wNum,relId,wNum,synsetId}
or raise an error if no such relation"
| secondVector |
arcs ifNil: [^ self error: 'relation not found'].
secondVector := (relId bitShift: MaskSynsetShift) + synsetId.
2 to: arcs size by: 2 do: [:index | ((arcs at: index) = secondVector)
ifTrue: [^ index ]].
^ self error: 'relation not found'! !

!WordNetsSynset methodsFor: 'accessing-relations' stamp: 'kwl 11/20/2008
08:19'!
inverseRelIdOf: relId
"answer the index of the inverse relation to
the argument, or the argument if idempotent"
| implemented relation inverse answer |
relation := (implemented := self basicAt: self basicSize) at: relId.
answer := relId even ifTrue: [relId -1] ifFalse: [relId +1].
inverse := implemented at: answer.
1 to: relation basicSize do: [:wNum |
(inverse basicAt: wNum) = (relation basicAt: wNum)
ifFalse: [^ answer]
].
^ relId! !

!WordNetsSynset methodsFor: 'accessing-relations' stamp: 'kwl 11/20/2008
08:24'!
neighboursAt: relId
"answer a collection of the receiver's neighbours id's under relId"
| answer negShift here |
arcs ifNil: [^ #() ].
answer := OrderedCollection new.
negShift := 0 - MaskSynsetShift.
2 to: arcs size by: 2 do: [:index |
((here := arcs at: index) bitShift: negShift) = relId
ifTrue: [answer add: (here bitAnd: MaskSynsetBits)].
].
^ answer! !

!WordNetsSynset methodsFor: 'accessing-relations' stamp: 'kwl 11/20/2008
08:33'!
relationIndices
"answer a collection of the receiver's relation indices"
| answer negShift |
arcs ifNil: [^ #() ].
answer := IdentitySet new.
negShift := 0 - MaskSynsetShift.
2 to: arcs size by: 2 do: [:index |
answer add: ((arcs at: index) bitShift: negShift).
].
^ answer asArray
! !

!WordNetsSynset methodsFor: 'printing' stamp: 'kwl 11/16/2008 15:14'!
printOn: aStream

aStream print: key; space; nextPutAll: self partOfSpeech; space; nextPut:
${.
1 to: self basicSize -1 do: [:ix | aStream print: (self basicAt: ix);
space].
aStream print: (self basicAt: self basicSize); nextPut: $}! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/20/2008 08:02'!
at: wNumX of: relIdX put: wNumY of: senseIdY
"this is part of class initialization"
| firstVector secondVector wordArray |
arcs isInteger ifTrue: [WordArray streamContents: [:arcStream |
arcStream nextPut: arcs.
arcs := arcStream]
].
firstVector := (wNumX bitShift: 8) + wNumY.
self assert: [((secondVector := (relIdX bitShift: MaskSynsetShift) +
senseIdY) bitAnd: MaskSynsetBits) = senseIdY]
description: ['cannot compact 2nd vector'].
wordArray := arcs braceArray.
3 to: arcs position by: 2 do: [:index | ((wordArray at: index) =
secondVector)
ifTrue: ["Princetonian duplicate" ^ self]].
arcs
nextPut: firstVector;
nextPut: secondVector! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/19/2008 20:33'!
clampArcStreams
arcs ifNil: [^ self].
arcs isStream ifTrue: [
key := arcs braceArray first.
^ arcs := arcs braceArray copyFrom: 2 to: arcs position].
arcs isInteger ifTrue: [key := arcs. ^ arcs := nil].
1 to: self basicSize -1 do: [:pos |
(self basicAt: pos)
valuesDo: [:synset | synset clampArcStreams];
rehash
]! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/13/2008 11:47'!
key: aKey arcs: anObject
arcs := anObject.
^ super key: aKey! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/19/2008 20:12'!
newKey
"this is part of class initialization"
^ arcs isInteger
ifTrue: [arcs]
ifFalse: [arcs braceArray first]! !

!WordNetsSynset methodsFor: 'private' stamp: 'kwl 11/12/2008 21:15'!
shareKey: anObject
1 to: self basicSize do: [:wNum |
(self basicAt: wNum) = anObject
ifTrue: [^ self basicAt: wNum put: anObject]
]! !


!WordNetAdjectiveSynset methodsFor: 'accessing' stamp: 'kwl 11/16/2008
15:11'!
partOfSpeech
"answer the receiver's part of speech"
^ #adjective! !


!WordNetAdverbSynset methodsFor: 'accessing' stamp: 'kwl 11/16/2008 15:12'!
partOfSpeech
"answer the receiver's part of speech"
^ #adverb! !


!WordNetNounSynset methodsFor: 'accessing' stamp: 'kwl 11/16/2008 15:12'!
partOfSpeech
"answer the receiver's part of speech"
^ #noun! !


!WordNetVerbSynset methodsFor: 'accessing' stamp: 'kwl 11/16/2008 15:12'!
partOfSpeech
"answer the receiver's part of speech"
^ #verb! !


!WordNetsSynset class methodsFor: 'class initialization' stamp: 'kwl
11/19/2008 20:35'!
initialize
"WordNetsSynset initialize"
| lexEntries slash localDirectory fileStream fileName doSkip temp lexDict |

MaskSynsetBits := (1 bitShift: (MaskSynsetShift := 23)) -1.

lexEntries := {IdentityDictionary new. IdentityDictionary new.
IdentityDictionary new. IdentityDictionary new}.
slash := FileDirectory default pathNameDelimiter asString.
localDirectory := '..' , slash, 'WNprolog3.0' , slash.
fileStream := FileDirectory default readOnlyFileNamed: localDirectory,
(fileName := 'wn_sk.pl').
localDirectory := '.' , slash.
doSkip := [fileStream upTo: $.. [(temp := fileStream peek) notNil and:
[temp isSeparator]]
whileTrue: [fileStream next]].
temp := localDirectory, fileStream directory localName, slash, fileName.
'sense keys ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries senseKeys: fileStream progress: bar with:
doSkip.
].
lexDict := (temp := Array new: (lexEntries inject: 1 into: [:sum :dict |
sum + dict size]) * 3 >> 1) writeStream.
'lexical dictionary ' displayProgressAt: Sensor cursorPoint
from: 0 to: temp size
during: [:bar |
self initialize: lexEntries dictionary: lexDict progress: bar.
].
lexEntries := lexEntries, {OrderedCollection new. lexDict contents}.
self initializeMore: lexEntries.
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_ant.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'antonyms ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries antonyms: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_at.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'attributes ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries attributes: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_cls.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'categories ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries categories: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_cs.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'causes ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries causes: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_der.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'derivationals ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries derivationals: fileStream progress: bar
with: doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_ent.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'entailments ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries entailments: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_hyp.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'hypernyms ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries hypernyms: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_ins.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'instances ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries instances: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_mm.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'members ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries members: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_mp.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'parts ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries parts: fileStream progress: bar with: doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_per.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'pertainyms ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries pertainyms: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_sim.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'similarities ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries similarities: fileStream progress: bar with:
doSkip.
].
fileStream := fileStream directory readOnlyFileNamed:
(fileName := 'wn_ms.pl').
temp := localDirectory, fileStream directory localName, slash, fileName.
'substances ', temp displayProgressAt: Sensor cursorPoint
from: 0 to: fileStream size
during: [:bar |
self initialize: lexEntries substances: fileStream progress: bar with:
doSkip.
].
nil environment removeKey: (temp := #WordNet30) ifAbsent: nil.
temp := nil environment add: ((WordRootNet basicNew: lexEntries size -1)
key: temp arcs: lexEntries last).
lexEntries allButLast withIndexDo: [:dict :index | temp basicAt: index
put: dict].
temp clampArcStreams
! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/13/2008 14:27'!
adapt: synsetClone at: pos from: fromWord to: toWord
synsetClone assert: [(synsetClone basicAt: pos) = fromWord]
description: ['could not find the ', fromWord, ' synset #', synsetClone
key printString].
^ synsetClone basicAt: pos put: toWord; yourself! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:13'!
initialize: lexEntries antonyms: fileStream progress: indicator with: doSkip
| posX senseX wNumX posY senseY wNumY synsetX synsetY count |
AntonymIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6288024) clone at: 1
from: 'antonym' to: 'opposite_of');
add: (self adapt: (lexEntries first at: 6288024) clone at: 1
from: 'antonym' to: 'opposite_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
wNumX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $,) readStream.
wNumY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: wNumX of: AntonymIndex put: wNumY of: synsetY newKey.
synsetY at: wNumY of: AntonymIndex put: wNumX of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:13'!
initialize: lexEntries attributes: fileStream progress: indicator with:
doSkip
| posX senseX posY senseY synsetX synsetY whichIndex count |
AttributeIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 5849040) clone at: 2
from: 'attribute' to: 'attribute_to');
add: (self adapt: (lexEntries first at: 5849040) clone at: 2
from: 'attribute' to: 'attribute_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
(synsetY isMemberOf: WordNetAdjectiveSynset)
ifTrue: [whichIndex := AttributeIndex -1]
ifFalse: [whichIndex := AttributeIndex].
synsetX at: 0 of: whichIndex put: 0 of: synsetY newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries categories: fileStream progress: indicator with:
doSkip
| posX senseX wNumX posY senseY wNumY categoryIndices synsetX synsetY
whichRelation count |
TopicIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'topic_to');
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'topic_of'); size.
UsageIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'usage_to');
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'usage_of'); size.
RegionIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'region_to');
add: (self adapt: (lexEntries first at: 7997703) clone at: 2
from: 'category' to: 'region_of'); size.
categoryIndices := {TopicIndex. UsageIndex. RegionIndex}.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
wNumX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $,) readStream.
wNumY := Integer readFrom: (fileStream upTo: $,) readStream.
synsetY := (lexEntries at: posY) at: senseY.
whichRelation := categoryIndices at: (#($t $u $r) indexOf: fileStream peek).
synsetX at: wNumX of: whichRelation put: wNumY of: synsetY newKey.
synsetY at: wNumY of: whichRelation -1 put: wNumX of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries causes: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
CauseIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 7326557) clone at: 1
from: 'cause' to: 'cause_to');
add: (self adapt: (lexEntries first at: 7326557) clone at: 1
from: 'cause' to: 'cause_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: CauseIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: CauseIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries derivationals: fileStream progress: indicator with: doSkip
| posX senseX wNumX posY senseY wNumY synsetX synsetY count |
DerivationIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 13462387) clone at: 1
from: 'derivation' to: 'derivation_to');
add: (self adapt: (lexEntries first at: 13462387) clone at: 1
from: 'derivation' to: 'derivation_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
wNumX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $,) readStream.
wNumY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: wNumX of: DerivationIndex put: wNumY of: synsetY newKey.
synsetY at: wNumY of: DerivationIndex -1 put: wNumX of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/14/2008 14:49'!
initialize: lexEntries dictionary: lexDict progress: bar
"WordNetsSynset initialize"
| posHint senseHint count wordStream surface index temp wordSet |
count := 0.
lexEntries withIndexDo: [:dict :pos |
posHint := #('n' 'v' 'j' 'w') at: pos.
dict keysAndValuesDo: [:senseId :synset |
senseHint := senseId printString.
1 to: synset basicSize do: [:wNum |
lexDict nextPut: (synset basicAt: wNum), ' ', posHint, senseHint
]
].
bar value: (count := count + dict size).
].
wordStream := lexDict contents asSortedCollection.
bar value: (count := 0).
surface := nil.
lexDict reset.
wordStream do: [:sortKey |
index := sortKey size.
[(sortKey at: (index := index -1)) isDigit] whileTrue.
posHint := #($n $v $j $w) indexOf: (sortKey at: index).
senseHint := Integer readFrom: (sortKey readStream position: index; yourself).
temp := sortKey copyFrom: 1 to: index -2.
surface
ifNil: [surface := temp. wordStream := (Array new: 31) writeStream].
temp = surface
ifFalse: [wordSet := superclass basicNew: (index := wordStream contents) size.
index withIndexDo: [:synset :wNum | wordSet basicAt: wNum put: synset].
lexDict nextPut: (wordSet key: surface; yourself).
bar value: (count := count + 1).
surface := temp. wordStream reset
].
wordStream nextPut: ((lexEntries at: posHint) at: senseHint)
].
wordSet := superclass basicNew: (index := wordStream contents) size.
index withIndexDo: [:synset :wNum | wordSet basicAt: wNum put: synset].
lexDict nextPut: (wordSet key: surface; yourself).
bar value: (count := count + 1).
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries entailments: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
EntailIndex := lexEntries fifth
add: (self adapt: (lexEntries second at: 2634808) clone at: 1
from: 'entail' to: 'entail_to');
add: (self adapt: (lexEntries second at: 2634808) clone at: 1
from: 'entail' to: 'entail_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: EntailIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: EntailIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries hypernyms: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
KindIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6292836) clone at: 1
from: 'hypernym' to: 'kind_to');
add: (self adapt: (lexEntries first at: 6292836) clone at: 1
from: 'hypernym' to: 'kind_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: KindIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: KindIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries instances: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
InstanceIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 5820620) clone at: 3
from: 'instance' to: 'instance_to');
add: (self adapt: (lexEntries first at: 5820620) clone at: 3
from: 'instance' to: 'instance_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: InstanceIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: InstanceIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:14'!
initialize: lexEntries members: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
MemberIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'member_to');
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'member_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: MemberIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: MemberIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:15'!
initialize: lexEntries parts: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
PartIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'part_to');
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'part_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: PartIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: PartIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:15'!
initialize: lexEntries pertainyms: fileStream progress: indicator with: doSkip
| posX senseX wNumX posY senseY wNumY synsetX synsetY count |
PertainsIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6322357) clone at: 1
from: 'pertainym' to: 'pertains_to');
add: (self adapt: (lexEntries first at: 6322357) clone at: 1
from: 'pertainym' to: 'pertains_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
wNumX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $,) readStream.
wNumY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: wNumX of: PertainsIndex put: wNumY of: synsetY newKey.
synsetY at: wNumY of: PertainsIndex -1 put: wNumX of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 19:57'!
initialize: lexEntries senseKeys: fileStream progress: indicator with: doSkip
"WordNetsSynset initialize"
| temp posK senseK wNumK surface other newIndex |
[fileStream atEnd]
whileFalse:
[posK := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseK := Integer readFrom: (fileStream upTo: $,) readStream.
wNumK := Integer readFrom: (fileStream upTo: $,) readStream.
surface := (String readFrom: (fileStream upTo: $)) readStream) readStream upTo: $%.
temp := lexEntries at: posK.
posK := temp at: senseK ifAbsent: [temp at: senseK put: OrderedCollection new].
[posK size < wNumK] whileTrue: [posK addLast: wNumK].
posK at: wNumK put: surface.
doSkip value.
indicator value: fileStream position].
fileStream close.
temp := lexEntries inject: 1 into: [:size :dict | size + dict size].
other := (Array new: temp) writeStream.
surface := (Array new: temp) writeStream.
wNumK := {WordNetNounSynset. WordNetVerbSynset. WordNetAdjectiveSynset. WordNetAdverbSynset}.
newIndex := (posK := 0).
lexEntries with: wNumK do: [:dict :subclass |
posK := posK +1.
dict associationsDo: [:assoc |
temp := subclass basicNew: (senseK := assoc value) size.
temp key: assoc key arcs: ((newIndex := newIndex +1) bitShift: 3) + posK.
senseK withIndexDo: [:word :ix | temp basicAt: ix put: word].
other nextPut: assoc.
surface nextPut: temp.
]].
other contents elementsExchangeIdentityWith: surface contents.
lexEntries do: [:dict | dict rehash]! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:15'!
initialize: lexEntries similarities: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
ClusterIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6322357) clone at: 1
from: 'pertainym' to: 'cluster_to');
add: (self adapt: (lexEntries first at: 6322357) clone at: 1
from: 'pertainym' to: 'cluster_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: ClusterIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: ClusterIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 20:15'!
initialize: lexEntries substances: fileStream progress: indicator with: doSkip
| posX senseX posY senseY synsetX synsetY count |
SubstanceIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'substance_to');
add: (self adapt: (lexEntries first at: 6293746) clone at: 1
from: 'meronym' to: 'substance_of'); size.
count := 0.
[fileStream atEnd]
whileFalse:
[posX := (fileStream upTo: $(; next) codePoint - $0 codePoint.
senseX := Integer readFrom: (fileStream upTo: $,) readStream.
synsetX := (lexEntries at: posX) at: senseX.
posY := (fileStream next) codePoint - $0 codePoint.
senseY := Integer readFrom: (fileStream upTo: $)) readStream.
synsetY := (lexEntries at: posY) at: senseY.
synsetX at: 0 of: SubstanceIndex put: 0 of: synsetY newKey.
synsetY at: 0 of: SubstanceIndex -1 put: 0 of: synsetX newKey.
doSkip value.
indicator value: fileStream position.
count := 1 + count].
fileStream close.
Transcript space; show: count printString! !

!WordNetsSynset class methodsFor: 'private' stamp: 'kwl 11/19/2008 06:47'!
initializeMore: lexEntries
RuleOutIndex := lexEntries fifth
add: (self adapt: (lexEntries second at: 1147562) clone at: 1
from: 'rule_out' to: 'rule_out');
add: (self adapt: (lexEntries second at: 1147562) clone at: 2
from: 'rule_in' to: 'rule_in'); size.
NounIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6317862) clone at: 1
from: 'noun' to: 'non-noun');
add: (self adapt: (lexEntries first at: 6317862) clone at: 1
from: 'noun' to: 'noun'); size.
VerbIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6318062) clone at: 1
from: 'verb' to: 'non-verb');
add: (self adapt: (lexEntries first at: 6318062) clone at: 1
from: 'verb' to: 'verb'); size.
AdjectiveIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6319029) clone at: 1
from: 'adjective' to: 'non-adjective');
add: (self adapt: (lexEntries first at: 6319029) clone at: 1
from: 'adjective' to: 'adjective'); size.
AdverbIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6319157) clone at: 1
from: 'adverb' to: 'non-adverb');
add: (self adapt: (lexEntries first at: 6319157) clone at: 1
from: 'adverb' to: 'adverb'); size.
ColligationIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 5764197) clone at: 1
from: 'colligation' to: 'colligation_to');
add: (self adapt: (lexEntries first at: 5764197) clone at: 1
from: 'colligation' to: 'colligation_of'); size.
ClauseIndex := lexEntries fifth
add: (self adapt: (lexEntries first at: 6314144) clone at: 1
from: 'clause' to: 'clause_to');
add: (self adapt: (lexEntries first at: 6314144) clone at: 1
from: 'clause' to: 'clause_of'); size.
ScriptIndex := lexEntries fifth
add: (self adapt: (lexEntries second at: 1756719) clone at: 1
from: 'script' to: 'script_to');
add: (self adapt: (lexEntries second at: 1756719) clone at: 1
from: 'script' to: 'script_of'); size! !


!WordRootNet methodsFor: 'accessing' stamp: 'kwl 11/19/2008 20:45'!
senseAt: wNumSynsetId
"given the argument answer a synset"
| synsetId |
synsetId := wNumSynsetId bitAnd: MaskSynsetBits.
^ (self basicAt: (synsetId bitAnd: 7)) at: synsetId! !

!WordRootNet methodsFor: 'accessing' stamp: 'kwl 11/14/2008 14:39'!
synsetsAt: surface
"answer synsets which contain the argument, a lemma"
| asciiOrder |
asciiOrder := String classPool at: #AsciiOrder.
^ arcs findBinary: [:synsets |
2 - (surface class compare: synsets key with: surface collated: asciiOrder)] ifNone: nil! !


!WordRootNet class methodsFor: 'public interface' stamp: 'kwl 11/14/2008 15:42'!
synsetsAt: surface
"answer synsets which contain the argument, a lemma"
^ (nil environment associationAt: #WordNet30)
synsetsAt: surface
"(WordRootNet synsetsAt: 'surface') explore"! !


!Object methodsFor: '*Mindlog-monkeypatches' stamp: 'kwl 11/13/2008 14:34'!
assert: boolBlock description: stringBlock
"Throw an assertion error if boolBlock does not evaluate to true."

"Enh: conditionally evaluate stringBlock, defaults to Object>>#value "
boolBlock value ifFalse: [ AssertionFailure signal: stringBlock value ]! !


!TestCase methodsFor: '*Mindlog-monkeypatches' stamp: 'kwl 11/20/2008 09:22'!
assert: aBooleanOrBlock description: aStringOrBlock
| aString |
aBooleanOrBlock value ifFalse: [
"Enh: conditionally evaluate stringBlock, defaults to Object>>#value "
aString := aStringOrBlock value.
self logFailure: aString.
TestResult failure signal: aString]
! !

WordNetsSynset initialize!


\ No newline at end of file
