Friday, June 11, 2010

Text Data Mining/Visualization - Did I really say it that much!?!

I was poking around looking at text data-collection and analysis tools, since a lot of our data is not in standard database or data file formats but in plain text. I ran across Wordle, an online text mining/analytic tool that generates "word clouds" from text. The clouds give greater visual prominence to words that appear more frequently in the source text. To create your own word cloud, simply copy and paste your text into the tool, or point the tool at your website/blog, and it will generate the word cloud.
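For the curious, the frequency-to-prominence idea can be sketched in a few lines of Python. This is only a rough illustration of the principle, not how Wordle itself is implemented: the `word_weights` function, the tiny stop-word list, and the font-size range are all my own assumptions.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tool would use a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "i", "is", "it"}

def word_weights(text, min_size=10, max_size=48):
    """Map each word to a font size that grows with how often it appears."""
    # Lowercase the text, pull out runs of letters, and drop common filler words.
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # Scale linearly: the most frequent word gets max_size.
    return {w: min_size + (max_size - min_size) * c / top for w, c in counts.items()}

sample = "Ubuntu again and Ubuntu again: Ubuntu is the topic"
sizes = word_weights(sample)
for word, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
    print(f"{word:10s} {size:.0f}pt")
```

Here the most frequent word ends up with the largest size, which is the same effect the cloud shows visually. A real tool also has to handle the layout of the words, which this sketch ignores entirely.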
 
Data mining/visualization will often reveal unexpected patterns in the data that let you verify, or at least postulate, the cause of what the data shows. Ideally, the analysis will also help you predict things like future customer buying habits or upcoming income/expenditure levels.
 
Anyway, pointing Wordle at my blog produced the image below. I knew I had been discussing Ubuntu a good deal lately, but until I looked at the word cloud I was not aware to what degree. Once again, data mining/visualization shows me something I had missed: I need to move on to other topics in my blog!
 

Posted via email from Mark's Musings
