Friday, January 14, 2011

Ameture Sergan Xmas Edition Iphone Walkthrough



Presentation of TaLTaC 2


http://www.taltac.it/it/index.shtml

TaLTaC stands for Automatic Lexical and Textual Processing for Content Analysis.
TaLTaC:
  • is software for the automated analysis of text, following the dual logic of Text Analysis (TA) and Text Mining (TM). This analysis gives quantitative representations of the phenomenon under study, both at the level of text units (words) and at the level of context units (fragments / documents), and therefore of both the language used and the content covered in the text. This approach is possible without physically reading the text collection, and hence regardless of the size of the corpus, which can be huge (millions of words).
  • originated from research carried out at the Universities of Salerno and Rome "La Sapienza" during the nineties, coordinated by Sergio Bolasco, Professor of Statistics at the Department of Geo-economic, Linguistic, Statistical and Historical Studies for Regional Analysis at Sapienza, and is the result of collaboration among researchers and colleagues from various Italian and French universities. (Credits)
  • uses both statistical and linguistic resources, highly integrated with each other and customizable by the user, and works at two levels, vocabulary and text, allowing on the one hand the analysis of the text (text analysis) and on the other the retrieval and extraction of information, according to the principles of data mining and text mining.
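
As a rough illustration of the two levels described above, vocabulary and context units, here is a minimal Python sketch. It is not TaLTaC's own code or interface: the function names, the naive tokenization rule and the tiny sample corpus are all invented for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Split a text into lowercase word tokens (a deliberately naive rule)."""
    return re.findall(r"[^\W\d_]+", text.lower())

def build_vocabulary(fragments):
    """Return (corpus-wide counts, per-fragment counts) for a list of fragments."""
    per_fragment = [Counter(tokenize(f)) for f in fragments]
    total = Counter()
    for counts in per_fragment:
        total.update(counts)
    return total, per_fragment

# Hypothetical three-fragment corpus, invented for illustration only.
corpus = [
    "The corpus is a collection of texts.",
    "Each text is split into fragments.",
    "Words are counted in the corpus and in each fragment.",
]
vocabulary, by_fragment = build_vocabulary(corpus)
print(vocabulary.most_common(3))
```

The corpus-wide counter plays the role of the vocabulary, while the list of per-fragment counters keeps the same information at the level of each context unit, so no one has to read the fragments to see which words occur where.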
With version 2.0 of the program, released in November 2005, the acronym gained a second C (TaLTaC2) to stress a further research purpose: the analysis of the Corpus as such, that is, the study of some of its characteristics regardless of the content.
Automatic processing, following a lexicometric approach, makes it possible to discover some constants of a text, a sort of DNA of the corpus.
At the end of 2007, the software was in use in Italy in more than 50 university departments, more than 20 research centers and institutions of national interest, and in some foreign universities.

TaLTaC 2 consists of a set of tools for studying any kind of linguistic data gathered as a collection of texts forming a single corpus, using the techniques of "textual statistics" (*). This approach allows the study of unstructured information found in a large document base (hundreds or thousands of pages, or even a 100 MB file), together with structured information (qualitative or quantitative variables) contained in a database associated with it.

TaLTaC 2 is designed, in both input and output, for use with other text analysis and text mining software, particularly those typical of the lexicometric approach, such as Alceste, Hyperbase, Lexico, Spad, Sphinx, T-Lab, Tropes. (Links)

In general, the analyses in TaLTaC 2 make it possible to select and extract information from the corpus of texts analyzed (peculiar language, relevant language, specific language) and to operate according to the principles of text mining by searching for keywords or concepts.

The results obtained in TaLTaC can interact directly with other linguistic software (Tree Tagger, Nooj-Intex, Lexical Studio) and statistical software (Spad, SPSS, SAS).
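
To give an idea of what searching a corpus by keyword can look like, here is a small keyword-in-context sketch in Python. It only illustrates the general technique and does not reproduce how TaLTaC performs its searches: the function name, the `width` parameter and the sample documents are assumptions made for this example.

```python
import re

def keyword_in_context(documents, keyword, width=3):
    """Yield (document index, left context, keyword, right context) for each hit."""
    target = keyword.lower()
    for doc_id, text in enumerate(documents):
        tokens = re.findall(r"[^\W\d_]+", text.lower())
        for i, token in enumerate(tokens):
            if token == target:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                yield doc_id, left, token, right

# Hypothetical documents, invented for illustration only.
documents = [
    "Text mining extracts information from large collections of text.",
    "Statistical software can then process the extracted tables.",
]
for hit in keyword_in_context(documents, "text", width=2):
    print(hit)
```

Each hit is a plain tuple, so the results could be written to a table and handed to a statistical package for further processing.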

Learning the program is made easier by an online, hypertext guide with context-sensitive help (which takes you straight to the topic you are working on).

Since the second release, training courses on the full use of the program have been organized regularly. These courses are tied to the issuance of a license. To enroll, simply fill out the form on the website of the DSGSSAR Department of SAPIENZA University of Rome.
