By Thiago Alexandre Salgueiro Pardo, Lucas Antiqueira, Maria das Graças Volpe Nunes (auth.), Renata Vieira, Paulo Quaresma, Maria das Graças Volpe Nunes, Nuno J. Mamede, Cláudia Oliveira, Maria Carmelita Dias (eds.)

Since 1993, PROPOR Workshops have become an important forum for researchers involved in the Computational Processing of Portuguese, both written and spoken. This PROPOR Workshop follows previous workshops held in 1993 (Lisbon, Portugal), 1996 (Curitiba, Brazil), 1998 (Porto Alegre, Brazil), 1999 (Évora, Portugal), 2000 (Atibaia, Brazil) and 2003 (Faro, Portugal). The workshop has increasingly contributed to bringing together researchers and industry partners from both sides of the Atlantic. The constitution of an international Program Committee and the adoption of high-standard refereeing procedures demonstrate the steady development of the field and of its scientific community. In 2006 PROPOR received 56 paper submissions from 11 different countries: Brazil, Portugal, Spain, Norway, the United States, Italy, Japan, France, Canada, Denmark and the United Kingdom, of which nine are represented in the accepted papers. Each submitted paper underwent a careful, triple-blind review by the Program Committee. All those who contributed are mentioned in the following pages. The reviewing process led to the selection of 20 regular papers for oral presentation and 17 short papers for poster sessions, which are published in this volume. The workshop and this book were structured around the following main topics, seven for full papers: (i) automatic summarization; (ii) resources; (iii) automatic translation; (iv) named entity recognition; (v) tools and frameworks; (vi) systems and models; and another five topics for short papers: (vii) information extraction; (viii) speech processing; (ix) lexicon; (x) morpho-syntactic studies; (xi) web, corpus and evaluation.

Example text

We call this strategy “natural alignment”. As the detection of these long and natural segments is a trivial task that does not need further corrections, the extraction method we will describe is entirely unsupervised. To achieve the same accuracy as comparable approaches to bilingual lexicon extraction, the correlation metric we propose in this paper will be well-suited to account for longer aligned segments than sentences.

This work has been supported by Ministerio de Educación y Ciencia of Spain, within the projects CESAR+ and GaricoTerm, ref: BFF2003-02866.

P. Gamallo Otero, in R. Vieira et al. (Eds.): PROPOR 2006, LNAI 3960, pp. 41–49, 2006. © Springer-Verlag Berlin Heidelberg 2006

The following is a small sample of the evaluation corpus (anthroponyms and other proper names have been highlighted in bold): Jornal Dois, a informação com #Manuel #Menezes. Boa noite. A Comissão Europeia decidiu pedir a ^Portugal que explique alguns aspectos do traçado da auto-estrada do ^Algarve. Em causa está o projectado troco da ~A dois, que atravessa a zona de protecção especial de ^Castro ^Verde, e que poderá constituir uma violação da directiva comunitária sobre protecção das aves selvagens.


