Name: | FirstConstD_eng_V3 |
Author: | Erwin R. Komen & Rosanne Hebing |
Goal: | Find first constituents in English corpora that contain a "d-word" |
Comments: | Find all first constituents in clauses that: (1) are labeled as main clause (IP-MAT; SMAIN in the CGN version); (2) have a subject (identified by $_subject, excluding $_nosubject); (3) have a finite verb (identified by $_finiteverb); (4) where the constituent itself does not contain a subclause (IP-SUB etc.). Compare these first constituents with those that have: (5) a descendant word whose POS classifies it as a D-word. (6) Open question: may this D-word be pre-finite-verb but follow the preverbal subject? (Do such situations occur at all?) Notes: (7) Exclude "forthwith" from the "th" adverbs. (8) Do include ADV+ instances such as ADV+P (therefore). History: 30/8/2011 ERK Derived from FirstConstD_cgn_V4; 06/9/2011 ERK Look separately at "real" clause-initial first constituents with d-words; 08/9/2011 ERK Differentiate between subject/non-subject constituents with d-words |
Last change: | 30-8-2011 (created: Thursday 8 September 2011 13:26) |
Project type: | Xquery-psdx |
Queries: | D:\Data files\Corpora\CorpusStudio\Xq |
Output: | D:\Data files\Corpora\CorpusStudio\FirstConstituent\Xq |
Sources: | D:\Data files\Corpora\English\xml\Adapted\*.psdx |
Period Info: | D:\Data Files\Corpora\CorpusStudio\Query\EnglishPeriods.xml (changed: Tuesday 16 August 2011 14:29) |
Parameters: | Prec=3 Foll=1 |
Line | Input | Query | Output | Result | Cmp | Exmp | Goal |
1 | Source | matX-Vfin | matX-Vfin | matX-Vfin | - | + | Find the first constituent in main clauses |
2 | 1/out | matDvnw-Vfin | matDvnw-Vfin | matDvnw-Vfin | + | + | Find the first constituent in main clauses containing a D-word that is classified as pronoun (vnw) |
3 | 2/out | mat-i-Dvnw-Vfin | mat-i-Dvnw-Vfin | mat-i-Dvnw-Vfin | + | + | Find the "immediate" first constituent in main clauses containing a D-word that is classified as pronoun (vnw) |
4 | 3/out | mat-i-DvnwSbj-Vfin | mat-i-DvnwSbj-Vfin | mat-i-DvnwSbj-Vfin | + | + | Find the "immediate" first constituent in main clauses containing a D-word that is classified as pronoun (vnw). This first constituent is the subject. |
5 | 4/cmp | mat-i-Dvnw-Vfin | mat-i-DvnwNonSbj-Vfin | mat-i-DvnwNonSbj-Vfin | - | + | Find the "immediate" first constituent in main clauses containing a D-word that is classified as pronoun (vnw). This first constituent is NOT a subject. |
6 | 3/cmp | matDvnw-Vfin | mat-non-i-Dvnw-Vfin | mat-non-i-Dvnw-Vfin | - | + | Find the first constituent in main clauses containing a D-word that is classified as pronoun (vnw). The D-word is NOT in the "immediate" first constituent. |
7 | 2/cmp | matDadv-Vfin | matDadv-Vfin | matDadv-Vfin | + | + | Find the first constituent in main clauses containing a D-word that is classified as adverb (bw) |
8 | 7/out | matSbj-Dadv-Vfin | matSbj-Dadv-Vfin | matSbj-Dadv-Vfin | + | + | Find post-subject pre-finite-verb constituents in main clauses containing a D-word that is classified as adverb (bw) |
9 | 8/cmp | mat-i-Dadv-Vfin | mat-i-Dadv-Vfin | mat-i-Dadv-Vfin | + | + | Find the "really" first constituent in main clauses containing a D-word that is classified as adverb (bw) |
10 | 9/cmp | matDadv-Vfin | mat-non-i-Dadv-Vfin | mat-non-i-Dadv-Vfin | - | + | Find the first constituent in main clauses containing a D-word that is classified as adverb (bw). The D-word is NOT in the "really" first constituent. |
11 | 7/cmp | matX-Vfin | matOther-Vfin | matOther-Vfin | - | + | Find the first constituent in main clauses that contain neither a D-pronoun (vnw) nor a D-adverb (bw) before the finite verb |
12 | Source | anyDvnw | anyDvnw | anyDvnw | - | + | Find D-pronouns (NPs of type Dem or DemNP) anywhere in the corpus |
13 | Source | anyDadv | anyDadv | anyDadv | - | + | Find D-adverbs anywhere in the corpus |
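The Input column chains the lines together. As a rough model only, assuming that `out` holds the input items a query matched and `cmp` holds the remaining input items (the complement), the split performed by lines 1 to 7 and 11 can be sketched in Python. The `Clause` flags and the example clauses are hypothetical stand-ins for what the XQuery files actually test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """Toy main clause; the flags stand in for what the real queries test."""
    name: str
    d_vnw: bool = False             # a preverbal constituent holds a D-pronoun
    d_vnw_first: bool = False       # ... and it is the very first constituent
    first_is_subject: bool = False  # the first constituent is the subject
    d_adv: bool = False             # a preverbal constituent holds a D-adverb


def run_line(inputs, query):
    """Assumed semantics: 'out' = matched inputs, 'cmp' = unmatched inputs."""
    out = [c for c in inputs if query(c)]
    cmp_ = [c for c in inputs if not query(c)]
    return out, cmp_


source = [
    Clause("dem-subject-first", d_vnw=True, d_vnw_first=True, first_is_subject=True),
    Clause("dem-object-first", d_vnw=True, d_vnw_first=True),
    Clause("dem-not-first", d_vnw=True),
    Clause("d-adverb", d_adv=True),
    Clause("other"),
]

out1, _ = run_line(source, lambda c: True)              # line 1: matX-Vfin
out2, cmp2 = run_line(out1, lambda c: c.d_vnw)          # line 2: matDvnw-Vfin
out3, cmp3 = run_line(out2, lambda c: c.d_vnw_first)    # line 3: mat-i-Dvnw-Vfin
out4, cmp4 = run_line(
    out3, lambda c: c.d_vnw_first and c.first_is_subject)  # line 4
out5, _ = run_line(cmp4, lambda c: c.d_vnw_first)       # line 5: input 4/cmp
out6, _ = run_line(cmp3, lambda c: c.d_vnw)             # line 6: input 3/cmp
out7, cmp7 = run_line(cmp2, lambda c: c.d_adv)          # line 7: input 2/cmp
out11, _ = run_line(cmp7, lambda c: True)               # line 11: input 7/cmp


def names(clauses):
    return [c.name for c in clauses]
```

Under this reading, the outputs of lines 4, 5, 6, 7 and 11 together partition the clauses found by line 1.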
$_matrixIP | "IP-MAT*"; |
$_anyIP | "IP|IP-*"; |
$_anyadv | "ADV|ADV-*|ADV+*|ADV^*|ADV#*"; |
$_finiteverb | "BEI|BEP*|BED*|UTP|*HVI|*HVP*|*HVD*|*AXI|*AXP*|*AXD*|*MD|VBI|*VBP*|*VBD*|*DOI|*DOP*|*DOD*|NEG+BEI|NEG+BEP*|NEG+BED*|NEG+AXI|NEG+*AXP*|NEG+*AXD*|NEG+*MD|NEG+VBI|NEG+*VBP*|NEG+*VBD"; |
$_anyth | "th*|Th*|+t*|+T*|+d*|+D*|forth*|forTh*|for+t*|for+T*|for+d*|for+D*"; |
$_anyfalseth | "forth|forthw*|for+d|for+der"; |
$_nosubject | "*PRD*|*LFD*|*VOC*|*MSR*"; |
$_anynp | "NP|NP-*"; |
$_ignore_nodes_conj | concat($_ignore_nodes, "|CONJ*"); |
$np | $all[tb:Like(@Label, $_anynp)] |
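The definitions above are "|"-separated lists of wildcard patterns. The following Python sketch models how such a list might be matched; it assumes that `*` matches any run of characters, that alternatives are tried one by one, and that matching is case-sensitive. The helper `like` is a hypothetical stand-in for `tb:Like`, whose exact behaviour is not documented here.

```python
from fnmatch import fnmatchcase

# Pattern lists copied from the definitions above.
ANY_ADV = "ADV|ADV-*|ADV+*|ADV^*|ADV#*"
ANY_TH = "th*|Th*|+t*|+T*|+d*|+D*|forth*|forTh*|for+t*|for+T*|for+d*|for+D*"
ANY_FALSE_TH = "forth|forthw*|for+d|for+der"


def like(text, patterns):
    """Hypothetical model of tb:Like: true if any alternative matches."""
    return any(fnmatchcase(text, p) for p in patterns.split("|"))


def is_d_adverb(label, text):
    """Adverb label, text starting with th/+t/+d, and not a 'false th' word."""
    return (like(label, ANY_ADV)
            and like(text, ANY_TH)
            and not like(text, ANY_FALSE_TH))
```

For instance, `is_d_adverb("ADV", "there")` and `is_d_adverb("ADV+P", "therefore")` hold in this model, while `is_d_adverb("ADV", "forthwith")` does not, because `forthw*` in `$_anyfalseth` filters it out.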
File: | D:\Data files\Corpora\CorpusStudio\Xq\matX-Vfin.xq |
Goal: | Find the first constituent in main clauses |
Comment: | Select and show the first constituent. Conditions: (1) Main clause (2) It must have a subject (3) It must have a finite verb |
Changed: | Thursday 8 September 2011 16:16 (created: Thursday 25 August 2011 10:10) |
Query: | { for $search in //eTree[tb:HasLabel(@Label, $_matrixIP)] (: Only take into account clauses with a licit subject :) let $sbjAll := tb:SomeChildNo($search, $_subject, $_nosubject) (: Exclude starred elements and take the first one remaining :) let $sbj := $sbjAll[not(child::eLeaf/@Type = 'Star')][1] (: Get the finite verb :) let $Vfin := ru:one($search, 'FirstChild', $_finiteverb) (: Get the preverbal realm: the constituents preceding $Vfin :) (: Make sure they are not of the "ignore" type and they are not a sub clause :) let $preV := $Vfin/preceding-sibling::eTree[not(tb:Like(@Label, $_ignore_nodes_conj)) and not(some $dsc in descendant-or-self::eTree satisfies (tb:Like($dsc/@Label, $_anyIP)) ) ] (: Prepare a message to be shown in the output :) let $msg := concat('First const = [', tb:Tree($preV[1]), ']') (: Stipulate our limitations here :) where ( exists($Vfin) and exists($sbj) and exists($preV) ) (: Output providing a message in $msg :) return tb:MyForestMsg($search, $msg) } |
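The selection of the preverbal realm in this query can be illustrated on a toy tree. The Python sketch below is only a model: the XML is a simplified, hypothetical stand-in for a psdx clause, `IGNORE` replaces `$_ignore_nodes_conj` (whose full value is not given in this report) with just `CONJ*`, and the finite verb is assumed to be the first child whose label matches `$_finiteverb`. Subject checks and starred elements are left out.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase

FINITE = ("BEI|BEP*|BED*|UTP|*HVI|*HVP*|*HVD*|*AXI|*AXP*|*AXD*|*MD|VBI|*VBP*"
          "|*VBD*|*DOI|*DOP*|*DOD*")  # non-negated part of $_finiteverb
IGNORE = "CONJ*"                      # assumption: stand-in for $_ignore_nodes_conj
ANY_IP = "IP|IP-*"                    # $_anyIP

TOY = """
<eTree Label="IP-MAT">
  <eTree Label="CONJ"><eLeaf Type="Vern" Text="and"/></eTree>
  <eTree Label="PP"><eTree Label="CP-ADV"><eTree Label="IP-SUB"/></eTree></eTree>
  <eTree Label="ADVP-TMP"><eTree Label="ADV"><eLeaf Type="Vern" Text="then"/></eTree></eTree>
  <eTree Label="NP-SBJ"><eTree Label="PRO"><eLeaf Type="Vern" Text="he"/></eTree></eTree>
  <eTree Label="VBD"><eLeaf Type="Vern" Text="went"/></eTree>
  <eTree Label="NP-OB1"><eTree Label="N"><eLeaf Type="Vern" Text="home"/></eTree></eTree>
</eTree>
"""


def like(text, patterns):
    """Hypothetical model of tb:Like: true if any alternative matches."""
    return any(fnmatchcase(text, p) for p in patterns.split("|"))


def preverbal(clause):
    """Children before the finite verb, minus ignored nodes and subclauses."""
    children = clause.findall("eTree")
    vfin_at = next((i for i, c in enumerate(children)
                    if like(c.get("Label"), FINITE)), None)
    if vfin_at is None:
        return []
    return [c for c in children[:vfin_at]
            if not like(c.get("Label"), IGNORE)
            # iter() covers the node itself and all its descendants
            and not any(like(d.get("Label"), ANY_IP) for d in c.iter("eTree"))]


clause = ET.fromstring(TOY)
preverbal_labels = [c.get("Label") for c in preverbal(clause)]
```

Here the conjunction and the PP holding a subclause are dropped, so the first constituent reported for this toy clause would be the ADVP-TMP.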
File: | D:\Data files\Corpora\CorpusStudio\Xq\matDvnw-Vfin.xq |
Goal: | Find the first constituent in main clauses containing a D-word that is classified as pronoun (vnw) |
Comment: | Select and show the first constituents with a demonstrative. Conditions: (1) Main clause (2) It must have a subject (3) It must have a finite verb (4) The demonstrative is identified by the NPtype feature (Dem or DemNP) |
Changed: | Thursday 8 September 2011 16:16 (created: Tuesday 30 August 2011 12:32) |
Query: | { for $search in //eTree[tb:HasLabel(@Label, $_matrixIP)] (: Only take into account clauses with a licit subject :) let $sbjAll := tb:SomeChildNo($search, $_subject, $_nosubject) (: Exclude starred elements and take the first one remaining :) let $sbj := $sbjAll[not(child::eLeaf/@Type = 'Star')][1] (: Get the finite verb :) let $Vfin := ru:one($search, 'FirstChild', $_finiteverb) (: Get the preverbal realm: the constituents preceding $Vfin :) (: Make sure they are not of the "ignore" type and they are not a sub clause :) let $preV := $Vfin/preceding-sibling::eTree[not(tb:Like(@Label, $_ignore_nodes_conj)) and not(some $dsc in descendant-or-self::eTree satisfies (tb:Like($dsc/@Label, $_anyIP)) )] (: Get those preverbal constituents, that are of type Dem or DemNP :) let $preVd := for $dsc in $preV/descendant-or-self::eTree where tb:Like(tb:Feature($dsc, 'NPtype'), 'Dem|DemNP') return $dsc (: Prepare a message to be shown in the output :) let $msg := concat('First const d-vnw = [', tb:Syntax($preVd[1]), ']') (: Stipulate our limitations here :) where ( exists($Vfin) and exists($sbj) and exists($preV) and exists($preVd) ) (: Output providing a message in $msg :) return tb:MyForestMsg($search, $msg) } |
File: | D:\Data files\Corpora\CorpusStudio\Xq\matDadv-Vfin.xq |
Goal: | Find the first constituent in main clauses containing a D-word that is classified as adverb (bw) |
Comment: | Select and show the first constituents with a demonstrative. Conditions: (1) Main clause (2) It must have a subject (3) It must have a finite verb (4) A "D-adverb" is recognized by its adverb label ($_anyadv) and its initial letters ($_anyth), excluding the words in $_anyfalseth |
Changed: | Thursday 8 September 2011 16:16 (created: Tuesday 30 August 2011 12:51) |
Query: | { for $search in //eTree[tb:HasLabel(@Label, $_matrixIP)] (: Only take into account clauses with a licit subject :) let $sbjAll := tb:SomeChildNo($search, $_subject, $_nosubject) (: Exclude starred elements and take the first one remaining :) let $sbj := $sbjAll[not(child::eLeaf/@Type = 'Star')][1] (: Get the finite verb :) let $Vfin := ru:one($search, 'FirstChild', $_finiteverb) (: Get the preverbal realm: the constituents preceding $Vfin :) (: Make sure they are not of the "ignore" type and they are not a sub clause :) let $preV := $Vfin/preceding-sibling::eTree[not(tb:Like(@Label, $_ignore_nodes_conj)) and not(some $dsc in descendant-or-self::eTree satisfies (tb:Like($dsc/@Label, $_anyIP)) )] (: Get those preverbal constituents with adverb starting with "D" type :) let $preVd := for $dsc in $preV/descendant-or-self::eTree where (tb:Like($dsc/@Label, $_anyadv) and tb:Like($dsc/child::eLeaf/@Text, $_anyth) and not(tb:Like($dsc/child::eLeaf/@Text, $_anyfalseth)) ) return $dsc (: =============== DEBUGGING ============ let $trc := ru:Trace(concat('preVd=', count($preVd), ' text=', tb:Tree($preVd), '\n')) ====================================== :) (: Prepare a message to be shown in the output :) let $msg := concat('First const d-adv = [', tb:Tree($preVd[1]), ']') (: Stipulate our limitations here :) where ( exists($Vfin) and exists($sbj) and exists($preV) and exists($preVd) ) (: Output providing a message in $msg :) return tb:MyForestMsg($search, $msg) } |
File: | D:\Data files\Corpora\CorpusStudio\Xq\anyDvnw.xq |
Goal: | Find D-pronouns (NPs of type Dem or DemNP) anywhere in the corpus |
Comment: | The "d-words" are identified by looking for constituents that have NPtype "Dem" or "DemNP". |
Changed: | Friday 2 September 2011 15:06 (created: Thursday 1 September 2011 16:35) |
Query: | { for $search in //eTree[tb:Like(@Label, $_anynp)] (: Get the NPtype feature :) let $np := tb:Feature($search, 'NPtype') (: Prepare a message :) let $msg := concat('Dem = ', tb:Tree($search)) (: Stipulate our limitations here :) where ( tb:Like($np, 'Dem|DemNP') ) (: Output providing a message in $msg :) return tb:MyForestMsg($search, $msg) } |
File: | D:\Data files\Corpora\CorpusStudio\Xq\anyDadv.xq |
Goal: | Find D-adverbs anywhere in the corpus |
Comment: | The "d-adverbs" are identified by looking for constituents that are labelled as adverb ($_anyadv) and whose text starts with "th" or an equivalent spelling such as "+t" or "+d" ($_anyth), excluding the words in $_anyfalseth |
Changed: | Friday 2 September 2011 15:06 (created: Thursday 1 September 2011 16:42) |
Query: | { for $search in //eTree[tb:Like(@Label, $_anyadv)] (: Prepare a message :) let $msg := concat('Dadv = ', tb:Tree($search)) (: Stipulate our limitations here :) where ( tb:Like($search/child::eLeaf/@Text, $_anyth) and not(tb:Like($search/child::eLeaf/@Text, $_anyfalseth)) ) (: Output providing a message in $msg :) return tb:MyForestMsg($search, $msg) } |
File: | D:\Data files\Corpora\CorpusStudio\Xq\matSbj-Dadv-Vfin.xq |
Goal: | Find post-subject pre-finite-verb constituents in main clauses containing a D-word that is classified as adverb (bw) |
Comment: | Select and show the constituents with a D-adverb preceding the finite verb. Conditions: (1) Main clause (2) It must have a subject (3) It must have a finite verb (4) A "D-adverb" is recognized by looking at its initial letters (5) The D-adverb follows the subject and precedes the finite verb |
Changed: | Thursday 8 September 2011 16:16 (created: Friday 2 September 2011 9:06) |
Query: | { for $search in //eTree[tb:HasLabel(@Label, $_matrixIP)] (: Only take into account clauses with a licit subject :) let $sbjAll := tb:SomeChildNo($search, $_subject, $_nosubject) (: Exclude starred elements and take the first one remaining :) let $sbj := $sbjAll[not(child::eLeaf/@Type = 'Star')][1] (: Get the finite verb :) let $Vfin := ru:one($search, 'FirstChild', $_finiteverb) (: Get the preverbal realm: the constituents preceding $Vfin :) (: Make sure they are not of the "ignore" type and they are not a sub clause :) let $preV := $Vfin/preceding-sibling::eTree[not(tb:Like(@Label, $_ignore_nodes_conj)) and not(some $dsc in descendant-or-self::eTree satisfies (tb:Like($dsc/@Label, $_anyIP)) )] (: Get those preverbal constituents with adverb starting with "D" type :) let $preVd := for $dsc in $preV/descendant::eTree where (tb:Like($dsc/@Label, $_anyadv) and tb:Like($dsc/child::eLeaf/@Text, $_anyth) and not(tb:Like($dsc/child::eLeaf/@Text, $_anyfalseth)) ) return $dsc (: Prepare a message to be shown in the output :) let $msg := concat('First const d-adv = [', tb:Tree($preVd[1]), ']') (: Stipulate our limitations here :) where ( exists($Vfin) and exists($sbj) and exists($preV) and exists($preVd) and ru:relates($sbj, $preVd[1], 'Precedes') ) (: Output providing a message in $msg :) return tb:MyForestMsg($search, $msg) } |
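The extra condition in this query is that the subject precedes the first D-adverb. What `ru:relates(..., 'Precedes')` checks exactly is not spelled out in this report; the Python sketch below assumes strict precedence, meaning that the first node and everything it dominates come earlier in document order than the second node. The toy tree and the function name `precedes` are hypothetical.

```python
import xml.etree.ElementTree as ET

TOY = """
<eTree Label="IP-MAT">
  <eTree Label="NP-SBJ"><eLeaf Type="Vern" Text="he"/></eTree>
  <eTree Label="ADVP-TMP"><eTree Label="ADV"><eLeaf Type="Vern" Text="then"/></eTree></eTree>
  <eTree Label="VBD"><eLeaf Type="Vern" Text="went"/></eTree>
</eTree>
"""


def precedes(root, a, b):
    """Assumed reading of 'Precedes': all of a ends before b starts."""
    order = list(root.iter())                    # document order
    last_of_a = max(order.index(n) for n in a.iter())
    return last_of_a < order.index(b)


root = ET.fromstring(TOY)
subject = root[0]        # NP-SBJ
adv_phrase = root[1]     # ADVP-TMP
d_adverb = root[1][0]    # ADV "then"
```

With this definition `precedes(root, subject, d_adverb)` holds, while a node never precedes one of its own descendants, so a D-adverb inside the subject itself would not count as following it.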
File: | D:\Data files\Corpora\CorpusStudio\Xq\mat-i-Dvnw-Vfin.xq |
Goal: | Find the "immediate" first constituent in main clauses containing a D-word that is classified as pronoun (vnw) |
Comment: | Select and show the first constituents with a demonstrative. Conditions: (1) Main clause (2) It must have a subject (3) It must have a finite verb (4) The demonstrative is identified by the NPtype feature (Dem or DemNP) (5) The demonstrative is in the very first preverbal constituent |
Changed: | Thursday 8 September 2011 16:16 (created: Tuesday 6 September 2011 13:39) |
Query: | { for $search in //eTree[tb:HasLabel(@Label, $_matrixIP)] (: Only take into account clauses with a licit subject :) let $sbjAll := tb:SomeChildNo($search, $_subject, $_nosubject) (: Exclude starred elements and take the first one remaining :) let $sbj := $sbjAll[not(child::eLeaf/@Type = 'Star')][1] (: Get the finite verb :) let $Vfin := ru:one($search, 'FirstChild', $_finiteverb) (: Get the preverbal realm: the constituents preceding $Vfin :) (: Make sure they are not of the "ignore" type and they are not a sub clause :) let $preV := $Vfin/preceding-sibling::eTree[not(tb:Like(@Label, $_ignore_nodes_conj)) and not(some $dsc in descendant-or-self::eTree satisfies (tb:Like($dsc/@Label, $_anyIP)) )][last()] (: Get the REALLY first constituent, if it is of the correct type :) let $preVd := $preV[some $dsc in $preV/descendant-or-self::eTree satisfies tb:Like(tb:Feature($dsc, 'NPtype'), 'Dem|DemNP')] (: =============== DEBUGGING ============ let $loc := $search//ancestor::forest/@Location let $trc := ru:Trace(concat($loc, '(vnw): ', if (exists($preVd)) then 'TRUE' else 'FALSE', ' preV=', tb:Tree($preV), ' preVd=', $preVd/@Label, ' text=', tb:Tree($preVd), '\n')) ====================================== :) (: Prepare a message to be shown in the output :) let $msg := concat('First const d-vnw = [', tb:Syntax($preVd[1]), ']') (: Stipulate our limitations here :) where ( exists($Vfin) and exists($sbj) and exists($preV) and exists($preVd) ) (: Output providing a message in $msg :) return tb:MyForestMsg($search, $msg) } |
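The difference with matDvnw-Vfin lies in the `[last()]` predicate. On the reverse axis `preceding-sibling::`, positions inside the step are counted backwards from the finite verb, so `[last()]` selects the sibling furthest from the verb, which is the leftmost preverbal constituent. Only that constituent is then searched for a demonstrative. The Python sketch below contrasts the two selections on hypothetical toy nodes; `Node`, its `np_type` field and the value "Pro" are illustrative assumptions, not the psdx data model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy constituent; np_type stands in for the NPtype feature."""
    label: str
    np_type: str = ""
    children: list = field(default_factory=list)


DEM_TYPES = ("Dem", "DemNP")


def walk(node):
    """Yield the node itself and all its descendants, in document order."""
    yield node
    for child in node.children:
        yield from walk(child)


def any_d_vnw(pre_v):
    """matDvnw-Vfin: demonstratives anywhere in the preverbal constituents."""
    return [n for c in pre_v for n in walk(c) if n.np_type in DEM_TYPES]


def immediate_d_vnw(pre_v):
    """mat-i-Dvnw-Vfin: the leftmost constituent, if it holds a demonstrative."""
    if pre_v and any(n.np_type in DEM_TYPES for n in walk(pre_v[0])):
        return pre_v[0]
    return None


# "That book he ...": the demonstrative NP is the very first constituent.
fronted = [Node("NP-OB1", "DemNP"), Node("NP-SBJ", "Pro")]
# "Yesterday in that house ...": the demonstrative sits in the second one.
later = [Node("ADVP-TMP", children=[Node("ADV")]),
         Node("PP", children=[Node("P"), Node("NP", "Dem")])]
```

In this model `later` is found by the general selection but not by the immediate one, which is the kind of clause that ends up in the mat-non-i-Dvnw-Vfin output.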
File: | D:\Data files\Corpora\CorpusStudio\Xq\mat-i-Dadv-Vfin.xq |
Goal: | Find the "really" first constituent in main clauses containing a D-word that is classified as adverb (bw) |
Comment: | Select and show the first constituents with a demonstrative. Conditions: (1) Main clause (2) It must have a subject (3) It must have a finite verb (4) A "D-adverb" is recognized by looking at its initial letters (5) The D-adverb is in the very first preverbal constituent |
Changed: | Thursday 8 September 2011 16:16 (created: Tuesday 6 September 2011 13:42) |
Query: | { for $search in //eTree[tb:HasLabel(@Label, $_matrixIP)] (: Only take into account clauses with a licit subject :) let $sbjAll := tb:SomeChildNo($search, $_subject, $_nosubject) (: Exclude starred elements and take the first one remaining :) let $sbj := $sbjAll[not(child::eLeaf/@Type = 'Star')][1] (: Get the finite verb :) let $Vfin := ru:one($search, 'FirstChild', $_finiteverb) (: Get the preverbal realm: the constituents preceding $Vfin :) (: Make sure they are not of the "ignore" type and they are not a sub clause :) let $preV := $Vfin/preceding-sibling::eTree[not(tb:Like(@Label, $_ignore_nodes_conj)) and not(some $dsc in descendant-or-self::eTree satisfies (tb:Like($dsc/@Label, $_anyIP)) )][last()] (: Get the REALLY first constituent, if it is of the correct type :) let $preVd := $preV[some $dsc in $preV/descendant-or-self::eTree satisfies (tb:Like($dsc/@Label, $_anyadv) and tb:Like($dsc/child::eLeaf/@Text, $_anyth) and not(tb:Like($dsc/child::eLeaf/@Text, $_anyfalseth)) )] (: =============== DEBUGGING ============ let $loc := $search//ancestor::forest/@Location let $trc := ru:Trace(concat($loc, '(adv): ', if (exists($preVd)) then 'TRUE' else 'FALSE', ' preV=', tb:Tree($preV), ' preVd=', $preVd/@Label, ' text=', tb:Tree($preVd), '\n')) ====================================== :) (: Prepare a message to be shown in the output :) let $msg := concat('First const d-adv = [', tb:Tree($preVd[1]), ']') (: Stipulate our limitations here :) where ( exists($Vfin) and exists($sbj) and exists($preV) and exists($preVd) ) (: Output providing a message in $msg :) return tb:MyForestMsg($search, $msg) } |
File: | D:\Data files\Corpora\CorpusStudio\Xq\mat-i-DvnwSbj-Vfin.xq |
Goal: | Find the "immediate" subject first constituent in main clauses containing a D-word that is classified as pronoun (vnw) |
Comment: | Select and show the first constituents with a demonstrative. Conditions: (1) Main clause (2) It must have a subject (3) It must have a finite verb (4) The demonstrative is identified by the NPtype feature (Dem or DemNP) (5) The demonstrative is in the very first preverbal constituent (6) This first constituent is the subject (GrRole = Subject) |
Changed: | Thursday 8 September 2011 16:16 (created: Thursday 8 September 2011 13:12) |
Query: | { for $search in //eTree[tb:HasLabel(@Label, $_matrixIP)] (: Only take into account clauses with a licit subject :) let $sbjAll := tb:SomeChildNo($search, $_subject, $_nosubject) (: Exclude starred elements and take the first one remaining :) let $sbj := $sbjAll[not(child::eLeaf/@Type = 'Star')][1] (: Get the finite verb :) let $Vfin := ru:one($search, 'FirstChild', $_finiteverb) (: Get the preverbal realm: the constituents preceding $Vfin :) (: Make sure they are not of the "ignore" type and they are not a sub clause :) let $preV := $Vfin/preceding-sibling::eTree[not(tb:Like(@Label, $_ignore_nodes_conj)) and not(some $dsc in descendant-or-self::eTree satisfies (tb:Like($dsc/@Label, $_anyIP)) )][last()] (: Get the REALLY first constituent, if it is of the correct type :) let $preVd := $preV[some $dsc in $preV/descendant-or-self::eTree satisfies tb:Like(tb:Feature($dsc, 'NPtype'), 'Dem|DemNP')] (: =============== DEBUGGING ============ let $loc := $search//ancestor::forest/@Location let $trc := ru:Trace(concat($loc, '(vnw): ', if (exists($preVd)) then 'TRUE' else 'FALSE', ' preV=', tb:Tree($preV), ' preVd=', $preVd/@Label, ' text=', tb:Tree($preVd), '\n')) ====================================== :) (: Prepare a message to be shown in the output :) let $msg := concat('First const d-vnw = [', tb:Syntax($preVd[1]), ']') (: Stipulate our limitations here :) where ( exists($Vfin) and exists($sbj) and exists($preV) and exists($preVd) and (tb:Feature($preVd, 'GrRole') = 'Subject') ) (: Output providing a message in $msg :) return tb:MyForestMsg($search, $msg) } |