Curriculum Vitae of Prof. Dr. Piek Th.J.M. Vossen

Piek Vossen (1960) is currently full-time Professor of Computational Lexicology at the Faculty of Arts, Department of Language, Cognition and Communication (LCC), at the Vrije Universiteit Amsterdam, and head of the Computational Lexicology & Terminology Lab (CLTL).

Piek Vossen studied Dutch and General Linguistics at the University of Amsterdam. In 1995, he received his PhD (cum laude) in Linguistics on Computational Lexicology and Lexicography. He has been involved in the following (EU) projects: Links, Acquilex I and II, Sift, EuroWordNet I and II, Meaning, Euroterm, Balkanet, Pidgin, Cornetto, DutchSemCor, Text to Politics, Semantics of History and the EC project KYOTO. He is also involved in standardisation initiatives, such as the ANSI committee for ontology standardisation, the EAGLES/ISLE project, FLaReNet and CLARIN-NL. In addition, he is Co-Founder and Co-President of the Global WordNet Association.

His PhD thesis, entitled "Grammatical and Conceptual Individuation in the Lexicon" (published by IFOTT, Amsterdam), describes a new model, the so-called Anchored Relational Model, for defining the syntactic and semantic properties of English and Dutch nouns based on differences in their pragmatic use. Using formal computer representations, the semantic and syntactic properties can be correlated in complex ways, making a further distinction between a cognitive level of meaning and a lexical semantic level (see also the review in International Journal of Lexicography, 1998; 11: 73-79).

From 1986 to 1998 Vossen was senior researcher at the University of Amsterdam, where he was responsible for a series of research projects, both national (LINKS) and international (ACQUILEX I, ACQUILEX II, SIFT). Vossen was project coordinator of the EuroWordNet I and II projects. The aim of the EuroWordNet project was to develop a multilingual database with wordnets for Dutch, Italian, Spanish, French, German, Czech, Estonian and English. In this database, the concepts in the wordnets are interconnected via an Inter-Lingual-Index. The EuroWordNet model is now used by many other groups and projects to build wordnets for their languages and to link them to the database.

From 1999 to 2001 Vossen was senior researcher and manager at Sail Labs, Antwerp, a long-term research laboratory developing the language technology of the future and responsible for developing novel technologies and strategies. Core technologies were: cross-lingual information retrieval, classification, semantic ontologies and wordnets, disambiguation, and corpus-based tuning and tailoring of resources and applications.

From 2001 to 2009 Vossen was the C.T.O. of Irion Technologies B.V., where he developed multilingual language technology for many different languages. The technology combined state-of-the-art statistical engines developed at TNO with intelligent language modules. It used a rich semantic network for 7 languages that is unique in the world, which made it possible to develop concept-based information and knowledge management systems. The main products were a cross-lingual concept-based search engine, a document classification system, and a question-answering system. All products are available for English, Dutch, German, French, Italian and Spanish.

In April 2006, Vossen was appointed (full-time since November 2009) as Professor of Computational Lexicology at the Faculty of Arts, Department of Language, Cognition and Communication (LCC), at the Vrije Universiteit Amsterdam. He is head of the Computational Lexicology & Terminology Lab (CLTL). In this function he is also a member of the Research Center for Advanced Media Research Amsterdam (CAMeRA) and a member of the Netherlands Graduate School of Linguistics LOT (Landelijke Onderzoekschool Taalwetenschap). His research interests are wordnets, computational lexicons, ontologies, computational linguistics, language technology and computer applications. He conducts research on wordnets and computational lexicons, both within a single language and from a multilingual perspective. Vossen is interested in the relation between lexicons and ontologies, from a theoretical point of view as well as from their usage in computer applications in which meaning and interpretation play a role. He sees the lexicon as a fundamental resource to anchor meaning and interpretation in useful computer behaviour. Computer behaviour can make use of communicative models and insights from communication science. The organization of the lexicon and the knowledge stored in it need to take that usage as a starting point.

Vossen has published more than a hundred articles in national and international journals, conference proceedings and (hand)books. He has given invited lectures at several conferences and other occasions, is a regular referee for (inter)national conferences and journals, and has served on several program and organizing committees. He also serves as a member of PhD committees, in the Netherlands as well as abroad. He was the writer and editor of the book "EuroWordNet: a multilingual database with lexical semantic networks" (Kluwer Academic Publishers, 1998) and wrote several chapters in prominent handbooks in the fields of ontologies, linguistics, language, lexicography etc.

Since 2000, he has been Co-Founder and Co-President (with Christiane Fellbaum) of the Global WordNet Association, and in this function he organized the First Global Wordnet Conference (India, 2002), the Second Global Wordnet Conference (Czech Republic, 2004), the Third Global Wordnet Conference (Korea, 2006), the Fourth Global Wordnet Conference (Hungary, 2008) and the Fifth Global Wordnet Conference (India, 2010).

In February 2006, the idea of the Global Wordnet Grid was launched at the 3rd GWC in Jeju, Korea: the building of a completely free worldwide wordnet grid. This grid will be built around a shared set of concepts, such as the Common Base Concepts used in many wordnet projects. These concepts will be expressed in terms of WordNet synsets and SUMO definitions. People from all language communities are invited to upload synsets from their language to the Grid. Gradually, all languages will then be represented in the Grid. The Grid will be available to everybody and will be distributed completely free.

In 2007 Vossen was also one of the initiators and organizers of IWIC2007, the First International Workshop on Intercultural Collaboration (Kyoto, 2007), after which a website called Intercultural Collaboration Gateway was launched with the aim of gathering information about intercultural collaboration.

He has been invited as an expert consultant for many (EU) projects, such as "Van Dale Groot Woordenboek der Nederlandse Taal", "Corpus Gesproken Nederlands", "Euroterm" and "Balkanet". He is also a member of the NWO IMIX (Interactieve Multimodale Informatie Extractie) program and of the STEVIN (Spraak- en Taaltechnologische Essentiële Voorzieningen In het Nederlands) program of the Nederlandse Taalunie, NWO, AWI, the FWO and IWT.

He is also responsible for the European part of the project "Constructing Arabic Wordnet (AWN) in Parallel with an Ontology", sponsored by the American government and headed by Princeton University. AWN will build a lexical resource for Standard Arabic, constructed according to the methods that were developed for EuroWordNet and have since been applied to dozens of languages around the world. The EuroWordNet approach maximizes compatibility across wordnets and focuses on manual encoding of the most complicated and important concepts.

From 2006 to 2009, he was the project coordinator of "Cornetto" (Combinatorial Relational Network for Language Applications), financially supported by the Dutch-Flemish Language Union. Cornetto built a lexical semantic database for Dutch with rich vertical and horizontal semantic relations and combinatorial lexical constraints such as multiword expressions, idioms and collocations on the one hand, and lexical functions and frames on the other. The concepts are aligned with the English WordNet so that ontologies and domain labels can be imported. The semantic layer is validated with a formal ontology to make it usable in Semantic Web environments. Cornetto covers 40K entries, including the most generic and central part of the language. The database goes beyond the structure and content of WordNet and FrameNet. Cornetto was an initiative of Vossen and the Vrije Universiteit Amsterdam to combine the Dutch wordnet and the Referentie Bestand Nederlands (a Dutch database with combinatorial information on Dutch word meanings) into a unique resource for Dutch. He will also be a consultant for the Nederlandse Taalunie on the revision of the Referentie Bestand Nederlands.

Since March 2008 Vossen has been the project coordinator of the 7th EU Framework Project "Knowledge Yielding Ontologies for Transition-based Organization" (KYOTO), in the area of Digital Libraries. KYOTO makes knowledge sharable between communities of people, cultures, languages and computers by assigning meaning to text and giving text to meaning. To this end, it develops a cross-lingual and cross-cultural knowledge and information transition system that is applied to the domain of the environment in Dutch, English, Italian, Spanish, Chinese and Japanese. The goal of KYOTO is a system that allows people in communities to define the meaning of their words and terms in a shared Wiki platform, so that this meaning becomes anchored across languages and cultures and a computer can use it to detect knowledge and facts in text. Whereas the current Wikipedia uses free text to share knowledge, KYOTO will represent this knowledge so that a computer can understand it. In February the 1st International Kyoto Workshop was organized by Vossen in Artis Amsterdam, the Netherlands.

Vossen is one of the members of FLaReNet (Fostering Language Resources Network), another 7th EU Framework Project. FLaReNet will discuss critical issues for the future development, deployment and use of language resources (LRs) and will indicate best practices and best policies for coordinating future actions and projects. The major activities of the Network will be to survey, analyse and classify LRs and relevant standards, together with their organisational and economic models, and to discuss with major stakeholders and players new common strategies for a capillary deployment and use of LRs in real-world products.

Since September 2009, Vossen has been one of the four members of the Management Team of CAMeRA, the interfaculty research institute. The Scientific Advisory Board consists of prominent members from the four participating faculties and research groups (i.e. Sociology, Computer Science, Clinical Psychology and Language & Communication).

Since May 2009 Vossen has also been a member of the National Advisory Panel of CLARIN-NL. CLARIN-NL aims to design, construct, validate, and exploit a research infrastructure that provides a sustainable and persistent eScience working environment for researchers in the Humanities, and Linguistics in particular, who want to make use of language resources and the technology to use these resources for their research. The CLARIN-NL project is a large national project in the Netherlands which aims to play a central role in the Europe-wide CLARIN infrastructure (CLARIN-EU). CLARIN-NL offers scholars tools for computer-aided language processing, addressing one or more of the multiple roles language plays in the Humanities and Social Sciences (i.e. carrier of cultural content and knowledge, instrument of communication, component of identity and object of study).

In July 2009, Vossen was invited to become a member of the Advisory Board of the Taalbank (GTB) of the Institute of Dutch Lexicography (INL).

Since September 2009, Vossen has been coordinating a new NWO-funded project, DutchSemCor. The goal of DutchSemCor is to deliver a one-million-word Dutch corpus that is fully tagged with senses and domain tags from the Cornetto database (STEVIN project STE05039). The corpus, for which the project aims to offer the same balance in types of text as these basic resources, will be extremely rich in terms of lexical semantic information. Its availability will enable many new lines of research and technology development for the Dutch language. In particular, it will enable research into the relation between language form and language interpretation, and as such it will be applicable in the fields of cognitive science, (psycho-)linguistics, language learning and language teaching, semantic web applications, information retrieval, machine translation, text mining, and document interpretation (summarization, topic segmentation). The corpus is expected to create new directions of research and technology development on a par with current developments for English.

In September 2009 two research projects were launched, funded by CAMeRA, the interfaculty research institute: 'Semantics of History' and 'From sentiments and opinions in texts to positions of political parties' (Text2Politics). The Semantics of History project develops a historical ontology and a lexicon that are used in a new type of information system that can handle the time-based dynamics and varying perspectives in historical archives. Text2Politics combines contemporary theories and methods in linguistics and political science to develop an automated research tool for rich text mining. The transdisciplinary relevance of the project is that a carefully constructed mining tool for language-meaning research can be applied to enhance the Kieskompas (Electoral Compass) and prove useful in the social sciences in general. The research will give new insights into the complexity of language use, the linguistic modeling of subjectivity and the representation of this knowledge in a lexicon. It will also shed new light on the complex dimensionality of competition between political parties.

Created by A.Weisscher at www.weisscher.info