Explorations in Automatic Thesaurus Discovery PDF Download

Explorations in Automatic Thesaurus Discovery

Author: Gregory Grefenstette
Publisher: Springer Science & Business Media
ISBN: 1461527104
Category: Computers
Languages: en
Pages: 313

Book Description
Explorations in Automatic Thesaurus Discovery presents an automated method for creating a first-draft thesaurus from raw text. It describes the natural language processing steps of tokenization, surface syntactic analysis, and syntactic attribute extraction. From these attributes, word and term similarity is calculated and a thesaurus is created showing important common terms and their relation to each other, common verb-noun pairings, common expressions, and word family members. The techniques are tested on twenty different corpora ranging from baseball newsgroups, assassination archives, medical X-ray reports, and abstracts on AIDS to encyclopedia articles on animals, and even on the text of the book itself. The corpora range from 40,000 to 6 million characters of text, and results are presented for each in the Appendix. The methods described in the book have undergone extensive evaluation. Their time and space complexity are shown to be modest. The results are shown to converge to a stable state as the corpus grows. The similarities calculated are compared to those produced by psychological testing. A method of evaluation using Artificial Synonyms is tested. Gold Standards evaluations show that the techniques significantly outperform non-linguistic-based techniques for the most important words in corpora. Explorations in Automatic Thesaurus Discovery includes applications to information retrieval using established testbeds, to the enrichment of existing thesauri, and to semantic analysis. Also included are applications showing how to create, implement, and test a first-draft thesaurus.
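The idea of calculating word similarity from syntactic attributes can be illustrated with a small sketch. It is not the book's actual method: it assumes plain Jaccard overlap as the similarity measure, and the (word, attribute) pairs, the attribute labels such as `subj-of:throw`, and all function names are hypothetical stand-ins for what a real parser and extraction step would produce.

```python
from collections import defaultdict
from itertools import combinations


def attribute_sets(pairs):
    """Group (word, attribute) pairs into one attribute set per word."""
    sets = defaultdict(set)
    for word, attribute in pairs:
        sets[word].add(attribute)
    return dict(sets)


def jaccard(a, b):
    """Plain Jaccard overlap of two attribute sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_neighbours(pairs):
    """For every word, return the other word whose attribute set overlaps most."""
    sets = attribute_sets(pairs)
    best = {}
    for w1, w2 in combinations(sorted(sets), 2):
        score = jaccard(sets[w1], sets[w2])
        for word, other in ((w1, w2), (w2, w1)):
            # Keep the first candidate seen, replace it only on a strictly better score.
            if word not in best or score > best[word][1]:
                best[word] = (other, score)
    return best


# Hypothetical, hand-written attributes standing in for parser output.
PAIRS = [
    ("pitcher", "subj-of:throw"), ("pitcher", "adj:young"), ("pitcher", "obj-of:trade"),
    ("catcher", "subj-of:throw"), ("catcher", "adj:young"), ("catcher", "obj-of:sign"),
    ("ball", "obj-of:throw"), ("ball", "adj:fast"),
]

if __name__ == "__main__":
    for word, (other, score) in sorted(nearest_neighbours(PAIRS).items()):
        print(f"{word}: nearest neighbour {other} (similarity {score:.2f})")
```

In this toy data, "pitcher" and "catcher" share two of four distinct attributes and so come out as each other's nearest neighbour, while "ball" shares none with either. A first-draft thesaurus entry would list such neighbours for each word.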

Survey of Text Mining

Author: Michael W. Berry
Publisher: Springer Science & Business Media
ISBN: 147574305X
Category: Computers
Languages: en
Pages: 251

Book Description
Extracting content from text continues to be an important research problem for information processing and management. Approaches to capture the semantics of text-based document collections may be based on Bayesian models, probability theory, vector space models, statistical models, or even graph theory. As the volume of digitized textual media continues to grow, so does the need for designing robust, scalable indexing and search strategies (software) to meet a variety of user needs. Knowledge extraction or creation from text requires systematic yet reliable processing that can be codified and adapted for changing needs and environments. This book will draw upon experts in both academia and industry to recommend practical approaches to the purification, indexing, and mining of textual information. It will address document identification, clustering and categorizing documents, cleaning text, and visualizing semantic models of text.

ECAI 2006

Author: G. Brewka
Publisher: IOS Press
ISBN: 1607501899
Category: Computers
Languages: en
Pages: 892

Book Description
In the summer of 1956, John McCarthy organized the famous Dartmouth Conference, which is now commonly viewed as the founding event for the field of Artificial Intelligence. During the last 50 years, AI has seen tremendous development and is now a well-established scientific discipline all over the world. AI is also in excellent shape in Europe, as witnessed by the large number of high-quality papers in this publication. In comparison with ECAI 2004, there is a strong increase in the relative number of submissions from Distributed AI / Agents and Cognitive Modelling. Knowledge Representation & Reasoning is traditionally strong in Europe and remains the biggest area of ECAI-06. One reason the figures for Case-Based Reasoning are rather low is that much of the high-quality work in this area has found its way into prestigious applications and is thus represented under the heading of PAIS.

Spotting and Discovering Terms Through Natural Language Processing

Author: Christian Jacquemin
Publisher: MIT Press
ISBN: 9780262100854
Category: Computers
Languages: en
Pages: 406

Book Description
The acquired parsed terms can then be applied for precise retrieval and assembly of information."--BOOK JACKET.

Computational Linguistics and Intelligent Text Processing

Author: Alexander Gelbukh
Publisher: Springer
ISBN: 3540364560
Category: Language Arts & Disciplines
Languages: en
Pages: 652

Book Description
CICLing 2003 (www.CICLing.org) was the 4th annual Conference on Intelligent Text Processing and Computational Linguistics. It was intended to provide a balanced view of the cutting-edge developments in both the theoretical foundations of computational linguistics and the practice of natural language text processing with its numerous applications. A feature of CICLing conferences is their wide scope that covers nearly all areas of computational linguistics and all aspects of natural language processing applications. The conference is a forum for dialogue between the specialists working in these two areas. This year we were honored by the presence of our keynote speakers Eric Brill (Microsoft Research, USA), Aravind Joshi (U. Pennsylvania, USA), Adam Kilgarriff (Brighton U., UK), and Ted Pedersen (U. Minnesota, USA), who delivered excellent extended lectures and organized vivid discussions. Of 92 submissions received, after careful reviewing 67 were selected for presentation; 43 as full papers and 24 as short papers, by 150 authors from 23 countries: Spain (23 authors), China (20), USA (16), Mexico (13), Japan (12), UK (11), Czech Republic (8), Korea and Sweden (7 each), Canada and Ireland (5 each), Hungary (4), Brazil (3), Belgium, Germany, Italy, Romania, Russia and Tunisia (2 each), Cuba, Denmark, Finland and France (1 each).

Natural Language Processing – IJCNLP 2004

Author: Keh-Yih Su
Publisher: Springer Science & Business Media
ISBN: 3540244751
Category: Computers
Languages: en
Pages: 827

Book Description
This book constitutes the thoroughly refereed post-proceedings of the First International Joint Conference on Natural Language Processing, IJCNLP 2004, held in Hainan Island, China in March 2004. The 84 revised full papers presented in this volume were carefully selected during two rounds of reviewing and improvement from 211 papers submitted. The papers are organized in topical sections on dialogue and discourse; FSA and parsing algorithms; information extraction and question answering; information retrieval; lexical semantics, ontologies, and linguistic resources; machine translation and multilinguality; NLP software and applications; semantic disambiguation; statistical models and machine learning; taggers, chunkers, and shallow parsers; text and sentence generation; text mining; theories and formalisms for morphology, syntax, and semantics; word segmentation; NLP in mobile information retrieval and user interfaces; and text mining in bioinformatics.

Encyclopedia of Information Science and Technology, First Edition

Author: Mehdi Khosrow-Pour, D.B.A.
Publisher: IGI Global
ISBN: 159140794X
Category: Education
Languages: en
Pages: 3807

Book Description
Comprehensive coverage of critical issues related to information science and technology.

The Role of Digital Libraries in a Time of Global Change

Author: Gobinda Chowdhury
Publisher: Springer
ISBN: 3642136540
Category: Computers
Languages: en
Pages: 270

Book Description
The year 2010 was a landmark in the history of digital libraries because for the first time this year the ACM/IEEE Joint Conference on Digital Libraries (JCDL) and the annual International Conference on Asia-Pacific Digital Libraries (ICADL) were held together at the Gold Coast in Australia. The combined conferences provided an opportunity for digital library researchers, academics and professionals from across the globe to meet in a single forum to disseminate, discuss, and share their valuable research. For the past 12 years ICADL has remained a major forum for digital library researchers and professionals from around the world in general, and for the Asia-Pacific region in particular. Research and development activities in digital libraries that began almost two decades ago have gone through some distinct phases: digital libraries have evolved from mere networked collections of digital objects to robust information services designed for both specific applications as well as global audiences. Consequently, researchers have focused on various challenges ranging from technical issues such as networked infrastructure and the creation and management of complex digital objects to user-centric issues such as usability, impact and evaluation. Simultaneously, digital preservation has emerged and remained as a major area of influence for digital library research. Research in digital libraries has also been influenced by several socio-economic and legal issues such as the digital divide, intellectual property, sustainability and business models, and so on. More recently, Web 2.

Progress in Artificial Intelligence

Author: Luís Seabra Lopes
Publisher: Springer
ISBN: 364204686X
Category: Computers
Languages: en
Pages: 690

Book Description
This book contains a selection of high-quality, reviewed papers from the 14th Portuguese Conference on Artificial Intelligence, EPIA 2009, held in Aveiro, Portugal, in October 2009. The 55 revised full papers presented were carefully reviewed and selected from a total of 163 submissions. The papers are organized in topical sections on artificial intelligence in transportation and urban mobility (AITUM), artificial life and evolutionary algorithms (ALEA), computational methods in bioinformatics and systems biology (CMBSB), computational logic with applications (COLA), emotional and affective computing (EAC), general artificial intelligence (GAI), intelligent robotics (IROBOT), knowledge discovery and business intelligence (KDBI), multi-agent systems (MASTA), social simulation and modelling (SSM), text mining and applications (TEMA), as well as web and network intelligence (WNI).

Computational Processing of the Portuguese Language

Author: Nuno J. Mamede
Publisher: Springer Science & Business Media
ISBN: 3540404368
Category: Education
Languages: en
Pages: 282

Book Description
This book constitutes the refereed proceedings of the 6th International Workshop on Computational Processing of the Portuguese Language, PROPOR 2003, held in Faro, Portugal, in June 2003. The 24 revised full papers and 17 revised short papers presented were carefully reviewed and selected from 64 submissions. The papers are organized in topical sections on speech analysis and recognition; speech synthesis; pragmatics, discourse, semantics, syntax, and the lexicon; tools, resources, and applications; dialogue systems; summarization and information extraction; and evaluation.