Essential Speech and Language Technology for Dutch: Results by the STEVIN-programme
Peter Spyns, Jan Odijk
The book provides an overview of more than a decade of joint R&D efforts in the Low Countries on HLT for Dutch. It not only presents the state of the art of HLT for Dutch in the areas covered, but, even more importantly, a description of the resources (data and tools) for Dutch that have been created are now available for both academia and industry worldwide. The contributions cover many areas of human language technology (for Dutch): corpus collection (including IPR issues) and building (in particular one corpus aiming at a collection of 500M word tokens), lexicology, anaphora resolution, a semantic network, parsing technology, speech recognition, machine translation, text (summaries) generation, web mining, information extraction, and text to speech to name the most important ones. The book also shows how a medium-sized language community (spanning two territories) can create a digital language infrastructure (resources, tools, etc.) as a basis for subsequent R&D. At the same time, it bundles contributions of almost all the HLT research groups in Flanders and the Netherlands, hence offers a view of their recent research activities. Targeted readers are mainly researchers in human language technology, in particular those focusing on Dutch. It concerns researchers active in larger networks such as the CLARIN, META-NET, FLaReNet and participating in conferences such as ACL, EACL, NAACL, COLING, RANLP, CICling, LREC, CLIN and DIR ( both in the Low Countries), InterSpeech, ASRU, ICASSP, ISCA, EUSIPCO, CLEF, TREC, etc. In addition, some chapters are interesting for human language technology policy makers and even for science policy makers in general.