Torrens University Australia
Question
Asked 27th Dec, 2014
Can you recommend a Free Text Mining tool?
I would like to know which free online text mining tools I can use for mining user profiles.
Can user profile mining be done online?
Most recent answer
You are going to want to look at Orange - a well-built, open-source and free tool
1 Recommendation
Popular answers (1)
Rajiv Gandhi Institute of Technology, Bangalore
Dear friend
I have a few more to share with you.
Here is the list.
Carrot2 – text and search results clustering framework.
GATE – General Architecture for Text Engineering, an open-source toolbox for natural language processing and language engineering
Gensim - large-scale topic modelling and extraction of semantic information from unstructured text (Python)
OpenNLP - natural language processing
Natural Language Toolkit (NLTK) – a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python programming language.
RapidMiner with its Text Processing Extension – data and text mining software.
Unstructured Information Management Architecture (UIMA) – a component framework to analyze unstructured content such as text, audio and video, originally developed by IBM.
The programming language R provides a framework for text mining applications in the package tm.[4] The Natural Language Processing task view contains tm and other text mining library packages.[5]
The KNIME Text Processing extension.
KH Coder - For content analysis, text mining or corpus linguistics.
The PLOS Text Mining Collection[6]
11 Recommendations
All Answers (23)
Rajiv Gandhi Institute of Technology, Bangalore
Dear friend
Greetings.
Data Mining Lists - Free Text Mining Tools
Text mining concerns itself with discovering structure and patterns in unstructured data – usually text. There are many different approaches to this task: some focus on ancillary structures such as taxonomies and ontologies, some focus on semantics and natural language processing, while others use various algorithms to categorise and summarise. Which approach is most appropriate depends on the need.
GATE (General Architecture for Text Engineering)
This is a large full-lifecycle open source text mining software suite with several components:
* GATE Developer is an integrated environment consisting of language processing components which incorporate the widely used Information Extraction system along with other plugins.
* GATE Teamware provides a collaborative environment for document annotation. This is built around a workflow paradigm.
* GATE Embedded is a Java object library that provides an interface to other applications within the organisation.
KNIME Text Processing is a plug-in to the (free) KNIME data mining suite. It supports a six-step text processing workflow that starts with reading and parsing text, followed by named entity recognition, filtering and manipulation, word counting and keyword extraction, bag-of-words (BoW) and vector representation, and finally visualisation.
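To make the word counting and bag-of-words steps more concrete, here is a minimal sketch in plain Python. It is not KNIME code and does not use any KNIME API; the sample documents and the function names (`tokenize`, `bag_of_words`) are assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical sample documents, used only for illustration.
DOCS = [
    "Text mining finds patterns in text.",
    "Data mining finds patterns in data.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, vectors): one term-count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    # The vocabulary is every distinct term seen in any document, sorted.
    vocabulary = sorted(set().union(*counts))
    vectors = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, vectors


vocabulary, vectors = bag_of_words(DOCS)
print(vocabulary)
print(vectors)
```

Each document becomes a vector of term counts over a shared vocabulary, which is the form that clustering or classification steps usually expect as input.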
LPU (learning from Positive and Unlabeled Examples)
This is a text learning and classification system that utilises support vector machines (SVM) and EM (Expectation Maximisation) techniques. Runs in a DOS window.
Orange-Text
This is an add-in to the free Orange data mining suite. It operates within the visual analytics tools provided in Orange and adds the ability to process unstructured data.
RapidMiner Text Extension
This provides operators for the RapidMiner environment for statistical text analysis. Many data sources are supported, including plain text, HTML and PDF. Many filtering techniques are available, along with tokenization, stemming, stopword filtering and n-gram generation. All of this sits within the graphical interface provided by RapidMiner (which is a free data mining suite), and many tasks can be completed by drag and drop.
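For readers who have not met these operations before, the sketch below shows tokenization, stopword filtering and n-gram generation in plain Python. It is not RapidMiner code; the stopword list, the sample sentence and the function names are assumptions chosen for illustration, and stemming is left out because a faithful stemmer needs more than a few lines.

```python
import re

# A tiny, hypothetical stopword list; real tools ship much larger ones.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = "The mining of user profiles is a text mining task."
tokens = remove_stopwords(tokenize(sentence))
bigrams = ngrams(tokens, 2)
print(tokens)
print(bigrams)
```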
Other products of interest include:
Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning.
Apache Mahout supports recommendation mining, which takes users' behavior and tries to find items those users might like. Its clustering takes, for example, text documents and groups them into sets of topically related documents. Its classification learns from existing categorized documents what documents of a specific category look like, and can then assign unlabelled documents to the (hopefully) correct category.
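The classification idea described above can be sketched in a few lines of plain Python. This is not Mahout code (Mahout is a Java library) and the answer does not say which algorithm is meant; a multinomial naive Bayes classifier with add-one smoothing is used here as a stand-in, and the training data and function names are made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_docs):
    """Count documents per category and word occurrences per category."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in labelled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return doc_counts, word_counts


def classify(text, doc_counts, word_counts):
    """Return the category with the highest log-probability score."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, for illustration only.
TRAINING = [
    ("cheap flights and hotel deals", "travel"),
    ("hotel reviews and ratings", "travel"),
    ("topic modelling with python", "textmining"),
    ("stemming and tokenization of text", "textmining"),
]
model = train(TRAINING)
label = classify("hotel ratings", *model)
print(label)
```

With so little training data the result is only a toy, but the train-then-assign flow is the same one the paragraph describes.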
Text Analytics: A Business Guide
A report for business and technology managers wishing to understand the impact of rapidly evolving text analytics capabilities, and their application in business.
I hope this helps you
Best regards
Dr. Indrajit Mandal, Ph.D.
2 Recommendations
Federal University of ABC
I've been working with NLTK for applications of Corpus Linguistics in Portuguese.
1 Recommendation
Annamalai University
I've worked with Weka, RapidMiner, R and KNIME for text mining.
RapidMiner has an additional plugin for text mining.
1 Recommendation
University of Padova
I agree with Hassan. I'd suggest R, and in particular the package tm (in my opinion not extremely efficient, but it makes it very easy to study text mining approaches on small to medium-scale collections).
1 Recommendation
Dear Nouran Radwan;
My experience:
Carrot2
I use it in my scientific research studies and papers.
It helps me a lot. It is free and very helpful for researchers.
Textalyser
It also helps researchers. It is available online, with a time limit.
Good luck with your research
Best Regards
1 Recommendation
Dear Nouran Radwan;
For my new scientific paper, I installed the text mining packages for R.
I will work in R for this study. It does not look too difficult.
You can visit http://www.r-project.org/ for R and http://www.rstudio.com/ for RStudio, and look at packages such as tm (http://127.0.0.1:33888/help/library/tm/doc/extensions.pdf) and mlflex.RLearner, which interfaces with the R statistical package.
Good Luck With Your Research
1 Recommendation
National Research University Higher School of Economics
Hello,
I specialize in public data gathering (web harvesting) from open access websites by programming a web-crawler. The data can later be used for statistical or content analysis. For example, my recent collection was data from booking.com and tripadvisor.com with information about reviews, ratings and prices along with the accompanying data such as geographical region, addresses, and many more. The data comes out in a form that is easily converted to SPSS or Excel format.
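As a rough illustration of the last step, the sketch below turns a small piece of review markup into CSV text that Excel or SPSS can import. It uses only the Python standard library and does no crawling; the HTML snippet, the class names (`review`, `rating`, `text`) and the `ReviewParser` class are assumptions invented for the example, since real sites use their own structure.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup; real sites use their own structure and class names.
SAMPLE_HTML = """
<div class="review"><span class="rating">4</span><p class="text">Clean rooms</p></div>
<div class="review"><span class="rating">2</span><p class="text">Noisy street</p></div>
"""


class ReviewParser(HTMLParser):
    """Collect one {'rating', 'text'} row per review block."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class == "review":
            self.rows.append({"rating": "", "text": ""})
        elif css_class in ("rating", "text") and self.rows:
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        self._field = None


parser = ReviewParser()
parser.feed(SAMPLE_HTML)

# Write the rows as CSV into an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["rating", "text"], lineterminator="\n")
writer.writeheader()
writer.writerows(parser.rows)
csv_text = buffer.getvalue()
print(csv_text)
```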
Technically, any website or social network can be a source of data. Please feel free to contact me if you are interested as I am open for research collaboration.
Sincerely,
Evgeny
Dear Nouran;
I tried TextSTAT - Simple Text Analysis Tool http://neon.niederlandistik.fu-berlin.de/en/textstat/.
It is very easy to use, and I like it for simple text analysis.
Success in your research.
1 Recommendation
HEC Montréal - École des Hautes Études commerciales
You can use WordStat Text Mining Tool for free: http://provalisresearch.com/products/content-analysis-software/
3 Recommendations
NelSenso.Net
You can try nelSenso.net Text Mining Tool for free: http://www.nelsenso.net
Texifter
The Coding Analysis Toolkit (CAT) is a free, open source, web-based, collaborative text analysis tool.
http://cat.texifter.com
The key features are related to the measurement of inter-rater reliability and the adjudication of annotator disagreements. I have attached a peer-reviewed scholarly paper on the software itself as well as the general question of using software for text analysis (I am the co-author).
What can you do with CAT?
- Efficiently code raw text data sets
- Annotate coding with shared memos
- Manage team coding permissions via the Web
- Create unlimited collaborator sub-accounts
- Assign multiple coders to specific tasks
- Easily measure inter-rater reliability
- Adjudicate valid & invalid coder decisions
- Report validity by dataset, code or coder
- Export coding in RTF, CSV or XML format
- Archive or share completed projects
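For readers unfamiliar with inter-rater reliability, one common measure for two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is plain Python, not CAT code, and this answer does not say which statistics CAT reports; the function name and the sample codes are assumptions for illustration.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same items in the same order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: computed from each coder's own code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same four text segments.
coder_1 = ["pos", "pos", "neg", "neg"]
coder_2 = ["pos", "neg", "neg", "neg"]
kappa = cohens_kappa(coder_1, coder_2)
print(kappa)
```

Here the coders agree on three of four segments (0.75) while chance agreement is 0.5, so kappa comes out at 0.5.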
1 Recommendation
University of Bonn
Hi Nouran Radwan, it really depends on what you are trying to accomplish with the user profile text mining. I usually work with interview data and rely on the R language (as Nikos Koutsoupias suggested, RQDA is one of the options here). I mostly stick with tidy tools. The 'Text Mining with R' book <https://www.tidytextmining.com/index.html> walks through the most common text mining operations in that style. The authors have a number of other related projects that are worth a look:
Julia Silge <https://juliasilge.com/>
David Robinson <http://varianceexplained.org/>
1 Recommendation