The Limits of Co-occurrence Analyses

The Limits of Co-occurrence Analyses (Région de Bourgogne)
Robert French & Valerie Camos, Project Co-coordinators
Total Amount of the Grant: 139,500 euros
Duration: 1 year

The overall objective of this project is to study new applications for, as well as the limits of, text co-occurrence programs such as LSA (Landauer & Dumais, 1997), HAL (Lund & Burgess, 1996), and PMI-IR (Turney, 2001). French & Labiouse (2002) pointed out four problems with word co-occurrence programs: i) the difficulties caused by the intrinsic deformability of semantic space; ii) the current inability of these programs to detect co-occurrences of abstract/relational structure, especially distal relational structure; iii) their lack of essential world knowledge (e.g., fathers are always men, mothers always women), which humans acquire through learning or direct experience with the world; and iv) their assumption of the atomic nature of words. We hope to use techniques drawn from statistics, linguistics, and analogy-making to explore these four problems and to determine to what extent current co-occurrence programs can (or cannot) handle them. In addition, within these limits, we hope to examine new possibilities for the use of these types of programs.