Field Description

 

 

 

Fieldnames and descriptions of the files

File Type : FileName

Lexical Databases by Educational Level
manu1 - manu2 - manu35 - manuAll

Orthographic (no*) and Phonographic (nop*) Neighborhoods by Educational Level
no1 - no2 - no35 - noAll
nop1 - nop2 - nop35 - nopAll

Homographic Heterophones (hg*) and Heterographic Homophones (hp*) by Educational Level
hg1 - hg2 - hg35 - hgAll
hp1 - hp2 - hp35 - hpAll

Infralexical Tables
letter - phonem - bigram - trigram - biphone - syllable - GP - PG

(1: first grade; 2: second grade; 35: grades 3 to 5; All: grades 1 to 5)


So that the files can be imported into most software packages, field names are limited to a maximum of 8 characters. Database fields that report computation results are named using the following syntax:
1. Type of computation (nb: number; fr: frequency; co: consistency);
2. Object of the analysis (e. g., BIG for bigrams, SYL for syllables, GP for Grapheme-phoneme associations, ON for orthographic neighborhood);
3. Serial position in the word (when it applies; i: Initial, m: Middle, f: Final, t: Total);
4. Type of count (TY: by type; TO: by token)
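
The naming syntax above can be illustrated with a small parser. This is only a sketch: the function name parse_field and the regular expression are assumptions made for illustration and are not part of the database. The 'mi' (minimal value) marker is taken from the field lists given later in this description rather than from the four rules above, and fields without a type/token suffix (e. g., nbLET) are deliberately not handled.

```python
import re

# Hypothetical helper, not shipped with the database.
# Pattern: computation + OBJECT (upper case) + optional position + count.
FIELD_RE = re.compile(r"^(nb|fr|co)([A-Z]+)(mi|[imft])?(ty|to)$")

COMPUTATIONS = {"nb": "number", "fr": "frequency", "co": "consistency"}
POSITIONS = {"i": "initial", "m": "middle", "f": "final", "t": "total",
             "mi": "minimal"}  # "mi" comes from the frGPmity-style fields
COUNTS = {"ty": "type", "to": "token"}


def parse_field(name):
    """Split a computed-field name into its parts, or return None."""
    match = FIELD_RE.match(name)
    if match is None:
        return None
    computation, obj, position, count = match.groups()
    return {
        "computation": COMPUTATIONS[computation],
        "object": obj,
        "position": POSITIONS.get(position),  # None when no position applies
        "count": COUNTS[count],
    }
```

For instance, parse_field("nbONty") reports no position, because the position code is optional in the syntax.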

 

Filenames : manu1, manu2, manu35, manuAll

ortho : orthographic code
phon : phonological code (phonetic codes)
synt : syntactic class (NC: noun; NP: proper name; VER: verb; ADJ: adjective; PRO: pronoun; PRE: preposition; CON: conjunction; DET: determiner)

u : word frequency (frequency per million words) as described in the Manulex database

psyll : syllabic segmentation ('.' as separators; segmentation details)
nbsyll : number of syllables

gseg : graphemic segmentation ('.' as separators; segmentation details)
pseg : phonemic segmentation matching the graphemic one

gpmatch : grapheme-phoneme associations. This field makes it possible to find words containing a particular association. A "-" separates each grapheme from its corresponding phoneme, and a "." separates successive grapheme-phoneme associations (e. g., (ch-S.a-a.r-R) for the word 'char' /SaR/). The leftmost character, '(', marks the beginning of the word, and the rightmost character, ')', marks its end. These two characters can be used to find words containing a grapheme-phoneme association specifically at the beginning or at the end of the word (e. g., searching for '(ch-S.' or '.ch-S)' returns the words containing the ch-S association at the beginning or at the end of the word, respectively).
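
The gpmatch format described above lends itself to simple string operations. The sketch below is illustrative only: the function names build_gpmatch and has_association are hypothetical, and it assumes that pseg uses the same '.' separator as gseg and that neither '.' nor '-' occurs inside a grapheme or a phoneme code.

```python
def build_gpmatch(gseg, pseg):
    """Rebuild a gpmatch-style string from matching segmentations.

    Assumes both segmentations use '.' as separator (an assumption for pseg).
    """
    graphemes = gseg.split(".")
    phonemes = pseg.split(".")
    if len(graphemes) != len(phonemes):
        raise ValueError("gseg and pseg must have the same number of units")
    pairs = (f"{g}-{p}" for g, p in zip(graphemes, phonemes))
    return "(" + ".".join(pairs) + ")"


def has_association(gpmatch, grapheme, phoneme, where="any"):
    """Test whether a word contains a grapheme-phoneme association.

    where: "initial", "final" or "any".
    """
    pair = f"{grapheme}-{phoneme}"
    whole_word = f"({pair})"  # word made of a single association (assumed case)
    if where == "initial":
        return gpmatch.startswith(f"({pair}.") or gpmatch == whole_word
    if where == "final":
        return gpmatch.endswith(f".{pair})") or gpmatch == whole_word
    # Compare whole units so that "h-S" does not match inside "ch-S".
    return pair in gpmatch.strip("()").split(".")
```

Comparing whole units rather than raw substrings avoids false matches when one grapheme is the tail of a longer one.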

nbLET : number of letters
nbPHON : number of phonemes
nbGRAPH : number of graphemes

puortho : orthographic unicity point

Homophones and Homographs

nbHPty : number of HomoPhones, Type count
nbHGty : number of HomoGraphs, Type count
nbHPNGty : number of HomoPhones Not homoGraphic, Type count
nbHGNPty : number of HomoGraphs Not homoPhonic, Type count

nbHPto : number of HomoPhones, Token count
nbHGto : number of HomoGraphs, Token count
nbHPNGto : number of HomoPhones Not homoGraphic, Token count
nbHGNPto : number of HomoGraphs not homoPhonic, Token count

Bigrams and Biphones

frBIGtty : mean BIGram frequency, Type count
frBIGtto : mean BIGram frequency, Token count
frBIGity : frequency of the Initial BIGram, Type count
frBIGito : frequency of the Initial BIGram, Token count
frBIGmty : mean frequency of the middle BIGrams, Type count
frBIGmto : mean frequency of the middle BIGrams, Token count
frBIGfty : frequency of the Final BIGram, Type count
frBIGfto : frequency of the Final BIGram, Token count

frBIPtty : mean BIPhone frequency, Type count
frBIPtto : mean BIPhone frequency, Token count
frBIPity : frequency of the Initial BIPhone, Type count
frBIPito : frequency of the Initial BIPhone, Token count
frBIPmty : mean frequency of the Middle BIPhones, Type count
frBIPmto : mean frequency of the Middle BIPhones, Token count
frBIPfty : frequency of the Final BIPhone, Type count
frBIPfto : frequency of the Final BIPhone, Token count

Syllables

frSYLity : frequency of the Initial SYLlable, Type count
frSYLito : frequency of the Initial SYLlable, Token count
frSYLmty : mean frequency of the Middle SYLlables, Type count
frSYLmto : mean frequency of the Middle SYLlables, Token count
frSYLfty : frequency of the Final SYLlable, Type count
frSYLfto : frequency of the Final SYLlable, Token count

 

Lexical neighborhood

nbONty : number of Orthographic Neighbors, Type count
nbONto : number of Orthographic Neighbors, Token count
nbPGNty : number of PhonoGraphic Neighbors (phonological AND orthographic neighbors), Type count
nbPGNto : number of PhonoGraphic Neighbors (phonological AND orthographic neighbors), Token count

 

Mean Frequency and Consistency of Grapheme-Phoneme and Phoneme-Grapheme Associations

frGPtty : mean frequency of Grapheme-Phoneme associations, Type count
frGPtto : mean frequency of Grapheme-Phoneme associations, Token count
(see note below)
coGPtty : mean consistency of Grapheme-Phoneme associations, Type count
coGPtto : mean consistency of Grapheme-Phoneme associations, Token count
coPGtty : mean consistency of Phoneme-Grapheme associations, Type count
coPGtto : mean consistency of Phoneme-Grapheme associations, Token count

 

Minimal Frequency and Consistency of Grapheme-Phoneme and Phoneme-Grapheme Associations

frGPmity : frequency of the Grapheme-Phoneme associations having the Minimal value on Type count
frGPmito : frequency of the Grapheme-Phoneme associations having the Minimal value on Token count
(see note below)
coGPmity : consistency of the Grapheme-Phoneme associations having the Minimal value on Type count
coGPmito : consistency of the Grapheme-Phoneme associations having the Minimal value on Token count
coPGmity : consistency of the Phoneme-Grapheme associations having the Minimal value on Type count
coPGmito : consistency of the Phoneme-Grapheme associations having the Minimal value on Token count

Frequency and Consistency (multiplied by 100) of Grapheme-Phoneme and Phoneme-Grapheme Associations by Position (initial, middle, final)

frGPity : frequency of the Initial Grapheme-Phoneme association, Type count
frGPito : frequency of the Initial Grapheme-Phoneme association, Token count
frGPmty : mean frequency of the middle Grapheme-Phoneme associations, Type count
frGPmto : mean frequency of the middle Grapheme-Phoneme associations, Token count
frGPfty : frequency of the Final Grapheme-Phoneme association, Type count
frGPfto : frequency of the Final Grapheme-Phoneme association, Token count
(see note below)

coGPity : consistency of the Initial Grapheme-Phoneme association, Type count
coGPito : consistency of the Initial Grapheme-Phoneme association, Token count
coGPmty : mean consistency of the middle Grapheme-Phoneme associations, Type count
coGPmto : mean consistency of the middle Grapheme-Phoneme associations, Token count
coGPfty : consistency of the Final Grapheme-Phoneme association, Type count
coGPfto : consistency of the Final Grapheme-Phoneme association, Token count

coPGity : consistency of the Initial Phoneme-Grapheme association, Type count
coPGito : consistency of the Initial Phoneme-Grapheme association, Token count
coPGmty : mean consistency of the middle Phoneme-Grapheme associations, Type count
coPGmto : mean consistency of the middle Phoneme-Grapheme associations, Token count
coPGfty : consistency of the Final Phoneme-Grapheme association, Type count
coPGfto : consistency of the Final Phoneme-Grapheme association, Token count

(Notes. The frequency of Phoneme-Grapheme associations is identical to the frequency of Grapheme-Phoneme associations; all by-token values are computed using the U Frequency Index from Manulex.)


Filenames : no1, no2, no35, noall, nop1, nop2, nop35, nopall

ortho : word orthography

n1 to n25 : orthographic representation of the orthographic neighbors (in NO file) or phonographic neighbors (in NOP file)

f1 to f15 : corresponding word frequency (i. e., f1 corresponds to n1, f2 to n2, …) using the U frequency index


Filenames : hp1, hp2, hp35, hpall, hg1, hg2, hg35, hgall

ortho : word orthography
phon : word phonology

h1 to h15 : orthographic representation of the heterographic homophones (in HP file) or heterophonic homographs (in HG file)

f1 to f15 : corresponding word frequency (i. e., f1 corresponds to h1, f2 to h2, …) using the U frequency index
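
Because the neighbor and homophone/homograph files store each related word and its frequency in parallel numbered columns (n1/f1, n2/f2, … or h1/f1, h2/f2, …), a row usually has to be regrouped before use. The following is a minimal sketch under stated assumptions: a row is represented as a dictionary of strings, unused slots are empty strings, and the function name related_words is hypothetical. The example row in the usage note is invented for illustration.

```python
def related_words(row, prefix="n"):
    """Pair each related word with its U frequency.

    row:    dict mapping column names to string values (assumed layout).
    prefix: "n" for the no*/nop* files, "h" for the hp*/hg* files.
    Returns a list of (word, frequency) tuples; frequency is None when the
    matching f column is absent or empty.
    """
    pairs = []
    index = 1
    while f"{prefix}{index}" in row:
        word = row[f"{prefix}{index}"]
        if word:  # skip unused slots (assumed to be empty strings)
            raw = row.get(f"f{index}")
            freq = float(raw) if raw not in (None, "") else None
            pairs.append((word, freq))
        index += 1
    return pairs
```

With an invented row such as {"ortho": "bal", "n1": "bol", "f1": "12.5", "n2": "mal", "f2": "3", "n3": "", "f3": ""}, the function returns the two filled pairs and ignores the empty third slot.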


Filename : letter

unit : letter

fri1 : letter FRequency for Initial letters of words, for Level 1
fri2 : id. for Level 2
fri35 : id. for Levels 3 to 5
friAll : id. for all Levels (1 to 5)

frm1 : letter FRequency for middle letters of words, for Level 1
frm2 : id. for Level 2
frm35 : id. for Levels 3 to 5
frmAll : id. for all Levels (1 to 5)

frf1 : letter FRequency for Final letters of words, for Level 1
frf2 : id. for Level 2
frf35 : id. for Levels 3 to 5
frfAll : id. for all Levels (1 to 5)

Notes. Isolated letters are counted as Initial letters. For two-letter words, the first letter is counted as Initial and the last one as Final (there is no middle letter). All counts are by Type only (i. e., the absolute number of words including the letter, irrespective of word frequency).
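
The positional rule in the notes above can be sketched as follows. The function names split_positions and type_counts are hypothetical, and counting each word at most once per letter in middle position is an interpretation of "absolute number of words including the letters", not something the description states explicitly.

```python
from collections import Counter


def split_positions(units):
    """Return (initial, middle, final) for a sequence of units.

    A single unit is Initial only; with two units there is no middle.
    """
    units = list(units)
    if not units:
        return None, [], None
    initial = units[0]
    final = units[-1] if len(units) > 1 else None
    return initial, units[1:-1], final


def type_counts(words):
    """By-type positional counts: every distinct word counts once."""
    counts = {"initial": Counter(), "middle": Counter(), "final": Counter()}
    for word in set(words):
        if not word:
            continue
        initial, middle, final = split_positions(word)
        counts["initial"][initial] += 1
        for letter in set(middle):  # once per word (interpretation, see above)
            counts["middle"][letter] += 1
        if final is not None:
            counts["final"][final] += 1
    return counts
```

Since split_positions accepts any sequence, it can be applied unchanged to lists of phonemes, bigrams or syllables.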


Filename : phonem

unit : phoneme (see the phonetic characters)

fri1 : phoneme FRequency for Initial phonemes of words, for Level 1
fri2 : id. for Level 2
fri35 : id. for Levels 3 to 5
friAll : id. for all Levels (1 to 5)

frm1 : phoneme FRequency for middle phonemes of words, for Level 1
frm2 : id. for Level 2
frm35 : id. for Levels 3 to 5
frmAll : id. for all Levels (1 to 5)

frf1 : phoneme FRequency for Final phonemes of words, for Level 1
frf2 : id. for Level 2
frf35 : id. for Levels 3 to 5
frfAll : id. for all Levels (1 to 5)

Notes. Isolated phonemes are counted as Initial phonemes. For two-phoneme words, the first phoneme is counted as Initial and the last one as Final (there is no middle phoneme). All counts are by Type only (i. e., the absolute number of words including the phoneme, irrespective of word frequency).


Filename : bigram

unit : bigram

fri1ty : bigram FRequency for Initial bigrams of words, for Level 1, Type count
fri2ty : id. for Level 2
fri35ty : id. for Levels 3 to 5
friAllty : id. for all Levels (1 to 5)

frm1ty : bigram FRequency for middle bigrams of words, for Level 1, Type count
frm2ty : id. for Level 2
frm35ty : id. for Levels 3 to 5
frmAllty : id. for all Levels (1 to 5)

frf1ty : bigram FRequency for Final bigrams of words, for Level 1, Type count
frf2ty : id. for Level 2
frf35ty : id. for Levels 3 to 5
frfAllty : id. for all Levels (1 to 5)

fri1to : bigram FRequency for Initial bigrams of words, for Level 1, TOken count
fri2to : id. for Level 2
fri35to : id. for Levels 3 to 5
friAllto : id. for all Levels (1 to 5)

frm1to : bigram FRequency for middle bigrams of words, for Level 1, TOken count
frm2to : id. for Level 2
frm35to : id. for Levels 3 to 5
frmAllto : id. for all Levels (1 to 5)

frf1to : bigram FRequency for Final bigrams of words, for Level 1, TOken count
frf2to : id. for Level 2
frf35to : id. for Levels 3 to 5
frfAllto : id. for all Levels (1 to 5)

Notes. Isolated bigrams are counted as Initial bigrams. For two-bigram words, the first bigram is counted as Initial and the last one as Final (there is no middle bigram). All by-token values are computed using the U Frequency Index from Manulex.


Filename : trigram

unit : trigram

fri1ty : trigram FRequency for Initial trigrams of words, for Level 1, Type count
fri2ty : id. for Level 2
fri35ty : id. for Levels 3 to 5
friAllty : id. for all Levels (1 to 5)

frm1ty : trigram FRequency for middle trigrams of words, for Level 1, Type count
frm2ty : id. for Level 2
frm35ty : id. for Levels 3 to 5
frmAllty : id. for all Levels (1 to 5)

frf1ty : trigram FRequency for Final trigrams of words, for Level 1, Type count
frf2ty : id. for Level 2
frf35ty : id. for Levels 3 to 5
frfAllty : id. for all Levels (1 to 5)

fri1to : trigram FRequency for Initial trigrams of words, for Level 1, TOken count
fri2to : id. for Level 2
fri35to : id. for Levels 3 to 5
friAllto : id. for all Levels (1 to 5)

frm1to : trigram FRequency for middle trigrams of words, for Level 1, TOken count
frm2to : id. for Level 2
frm35to : id. for Levels 3 to 5
frmAllto : id. for all Levels (1 to 5)

frf1to : trigram FRequency for Final trigrams of words, for Level 1, TOken count
frf2to : id. for Level 2
frf35to : id. for Levels 3 to 5
frfAllto : id. for all Levels (1 to 5)

Notes. Isolated trigrams are counted as Initial trigrams. For two-trigram words, the first trigram is counted as Initial and the last one as Final (there is no middle trigram). All by-token values are computed using the U Frequency Index from Manulex.


Filename : biphone

unit : biphone

fri1ty : biphone FRequency for Initial biphones of words, for Level 1, Type count
fri2ty : id. for Level 2
fri35ty : id. for Levels 3 to 5
friAllty : id. for all Levels (1 to 5)

frm1ty : biphone FRequency for middle biphones of words, for Level 1, Type count
frm2ty : id. for Level 2
frm35ty : id. for Levels 3 to 5
frmAllty : id. for all Levels (1 to 5)

frf1ty : biphone FRequency for Final biphones of words, for Level 1, Type count
frf2ty : id. for Level 2
frf35ty : id. for Levels 3 to 5
frfAllty : id. for all Levels (1 to 5)

fri1to : biphone FRequency for Initial biphones of words, for Level 1, TOken count
fri2to : id. for Level 2
fri35to : id. for Levels 3 to 5
friAllto : id. for all Levels (1 to 5)

frm1to : biphone FRequency for middle biphones of words, for Level 1, TOken count
frm2to : id. for Level 2
frm35to : id. for Levels 3 to 5
frmAllto : id. for all Levels (1 to 5)

frf1to : biphone FRequency for Final biphones of words, for Level 1, TOken count
frf2to : id. for Level 2
frf35to : id. for Levels 3 to 5
frfAllto : id. for all Levels (1 to 5)

Notes. Isolated biphones are counted as Initial biphones. For two-biphone words, the first biphone is counted as Initial and the last one as Final (there is no middle biphone). All by-token values are computed using the U Frequency Index from Manulex.


Filename : syllable

unit : syllable

fri1ty : syllable FRequency for Initial syllables of words, for Level 1, Type count
fri2ty : id. for Level 2
fri35ty : id. for Levels 3 to 5
friAllty : id. for all Levels (1 to 5)

frm1ty : syllable FRequency for middle syllables of words, for Level 1, Type count
frm2ty : id. for Level 2
frm35ty : id. for Levels 3 to 5
frmAllty : id. for all Levels (1 to 5)

frf1ty : syllable FRequency for Final syllables of words, for Level 1, Type count
frf2ty : id. for Level 2
frf35ty : id. for Levels 3 to 5
frfAllty : id. for all Levels (1 to 5)

fri1to : syllable FRequency for Initial syllables of words, for Level 1, TOken count
fri2to : id. for Level 2
fri35to : id. for Levels 3 to 5
friAllto : id. for all Levels (1 to 5)

frm1to : syllable FRequency for middle syllables of words, for Level 1, TOken count
frm2to : id. for Level 2
frm35to : id. for Levels 3 to 5
frmAllto : id. for all Levels (1 to 5)

frf1to : syllable FRequency for Final syllables of words, for Level 1, TOken count
frf2to : id. for Level 2
frf35to : id. for Levels 3 to 5
frfAllto : id. for all Levels (1 to 5)

Notes. Isolated syllables are counted as Initial syllables. For two-syllable words, the first syllable is counted as Initial and the last one as Final (there is no middle syllable). All by-token values are computed using the U Frequency Index from Manulex.


Filenames : gp, pg


The gp and pg files have identical structures. The GP file concerns Grapheme-Phoneme mappings. The PG file concerns Phoneme-Grapheme mappings.

ortho : grapheme
phon : phoneme

oexample: orthographic code of word examples
pexample: phonological code of word examples

frt1ty : FRequency (Total) of the mappings, for Level 1, TYpe count
fri1ty : id. considering only Initial mappings of the words
frm1ty : id. considering only middle mappings
frf1ty : id. considering only Final mappings of the words

frt1to : FRequency (Total) of the mappings, for Level 1, TOken count
fri1to : id. considering only Initial mappings of the words
frm1to : id. considering only middle mappings
frf1to : id. considering only Final mappings of the words

cot1ty : COnsistency (Total) of the mappings, for Level 1, TYpe count
coi1ty : id. considering only Initial mappings of the words
com1ty : id. considering only middle mappings
cof1ty : id. considering only Final mappings of the words

cot1to : COnsistency (Total) of the mappings, for Level 1, TOken count
coi1to : id. considering only Initial mappings of the words
com1to : id. considering only middle mappings
cof1to : id. considering only Final mappings of the words

frt2ty : FRequency (Total) of the mappings, for Level 2, TYpe count
fri2ty : id. considering only Initial mappings of the words
frm2ty : id. considering only middle mappings
frf2ty : id. considering only Final mappings of the words

frt2to : FRequency (Total) of the mappings, for Level 2, TOken count
fri2to : id. considering only Initial mappings of the words
frm2to : id. considering only middle mappings
frf2to : id. considering only Final mappings of the words

cot2ty : COnsistency (Total) of the mappings, for Level 2, TYpe count
coi2ty : id. considering only Initial mappings of the words
com2ty : id. considering only middle mappings
cof2ty : id. considering only Final mappings of the words

cot2to : COnsistency (Total) of the mappings, for Level 2, TOken count
coi2to : id. considering only Initial mappings of the words
com2to : id. considering only middle mappings
cof2to : id. considering only Final mappings of the words

frt35ty : FRequency (Total) of the mappings, for Levels 3-5, TYpe count
fri35ty : id. considering only Initial mappings of the words
frm35ty : id. considering only middle mappings
frf35ty : id. considering only Final mappings of the words

frt35to : FRequency (Total) of the mappings, for Levels 3-5, TOken count
fri35to : id. considering only Initial mappings of the words
frm35to : id. considering only middle mappings
frf35to : id. considering only Final mappings of the words

cot35ty : COnsistency (Total) of the mappings, for Levels 3-5, TYpe count
coi35ty : id. considering only Initial mappings of the words
com35ty : id. considering only middle mappings
cof35ty : id. considering only Final mappings of the words

cot35to : COnsistency (Total) of the mappings, for Levels 3-5, TOken count
coi35to : id. considering only Initial mappings of the words
com35to : id. considering only middle mappings
cof35to : id. considering only Final mappings of the words

frtalty : FRequency (Total) of the mappings, for All levels, TYpe count
frialty : id. considering only Initial mappings of the words
frmalty : id. considering only middle mappings
frfalty : id. considering only Final mappings of the words

frtalto : FRequency (Total) of the mappings, for All levels, TOken count
frialto : id. considering only Initial mappings of the words
frmalto : id. considering only middle mappings
frfalto : id. considering only Final mappings of the words

cotalty : COnsistency (Total) of the mappings, for All levels, TYpe count
coialty : id. considering only Initial mappings of the words
comalty : id. considering only middle mappings
cofalty : id. considering only Final mappings of the words

cotalto : COnsistency (Total) of the mappings, for All levels, TOken count
coialto : id. considering only Initial mappings of the words
comalto : id. considering only middle mappings
cofalto : id. considering only Final mappings of the words
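
The 64 computed columns listed above follow one regular pattern: measure (fr, co), position (t, i, m, f), level (1, 2, 35, al) and count (ty, to). As a convenience, the names can be generated rather than typed. This is a sketch only; the function name gp_columns is hypothetical, and the order simply mirrors the order of the list above, which may differ from the column order in the actual files.

```python
from itertools import product


def gp_columns():
    """Generate the computed column names of the gp/pg files."""
    names = []
    for level in ("1", "2", "35", "al"):
        # fr/ty, fr/to, co/ty, co/to, as in the description
        for measure, count in product(("fr", "co"), ("ty", "to")):
            for position in "timf":  # total, initial, middle, final
                names.append(f"{measure}{position}{level}{count}")
    return names
```

Every generated name stays within the 8-character limit on field names.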

Notes. Isolated graphemes are counted as Initial graphemes. For two-grapheme words, the first grapheme is counted as Initial and the last one as Final (there is no middle grapheme). All by-token values are computed using the U Frequency Index from Manulex. The frequency of a mapping does not necessarily correspond to the number of words including that mapping, because a word can include several identical mappings in middle position.
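
This description does not define how consistency is computed. The sketch below therefore rests on an assumption: it uses the common definition in which the consistency of a grapheme-phoneme mapping is its frequency divided by the total frequency of the grapheme over all its pronunciations, multiplied by 100. It works by type, counts every occurrence of a mapping (in line with the note above), and reads gpmatch-style strings. The function name gp_consistency and the sample strings in the usage note are invented for illustration.

```python
from collections import Counter


def gp_consistency(gpmatches):
    """Type-count consistency (x 100) per (grapheme, phoneme) pair.

    ASSUMPTION: consistency = 100 * freq(grapheme-phoneme) / freq(grapheme).
    Assumes '-' and '.' never occur inside a grapheme or phoneme code.
    """
    pair_freq = Counter()
    grapheme_freq = Counter()
    for gpmatch in gpmatches:
        for unit in gpmatch.strip("()").split("."):
            grapheme, phoneme = unit.split("-", 1)
            pair_freq[(grapheme, phoneme)] += 1  # every occurrence counts
            grapheme_freq[grapheme] += 1
    return {
        pair: 100.0 * count / grapheme_freq[pair[0]]
        for pair, count in pair_freq.items()
    }
```

For two invented entries, "(ch-S.a-a.r-R)" and "(ch-k.a-a.o-o.s-s)", the grapheme 'ch' has two pronunciations of equal frequency, so each mapping would receive a consistency of 50, while 'a' mapped to /a/ would receive 100. Swapping the roles of grapheme and phoneme gives the phoneme-grapheme direction under the same assumption.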
