prevent sentence responses in word cloud OR split a string of words before frequency analysis | Voters

Adam

The value of a word cloud is that it clearly presents how often a word is supplied by multiple voters. That value diminishes when the 'common' word arrives inside a response containing a string of words, because it's unlikely that all voters will provide the exact same string (i.e. the same sentence).
For example, I ask "what are the characteristics of a good leader?" and I get:
"fair"
"fair!"
"should be fair"
Because more than one word (plus punctuation) can be submitted as an answer, I don't get a big "fair" in my cloud. I get these three answers, each appearing very small. That means I need to analyse the responses myself, in situ, which rather defeats both the beauty and the purpose of a word cloud!
So (and I accept we can't create a system that anticipates and handles the wealth of variation in the language we each use) we need some coding:
1. IF the string (the answer) contains multiple words (a sentence)
2. SPLIT the string into its component words (we now have one "answer" per word)
3. ANALYSE word frequencies based on the split strings, ignoring punctuation and common words (e.g. "and")
As an afterthought, users can hyphenate where a phrase needs to stay together, like "decision-making" or "level-headed".
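
The steps above could be sketched roughly as follows. This is only an illustration in Python, not how the product works today: the function name `word_frequencies` and the short `STOP_WORDS` list are made up for the example, and the pattern only handles unaccented letters and digits.

```python
import re
from collections import Counter

# Hypothetical list of common words to ignore; a real list would be much longer.
STOP_WORDS = {"a", "an", "and", "at", "be", "is", "of", "should", "the", "to"}

def word_frequencies(answers):
    """Count words across all answers, splitting sentences into single words."""
    counts = Counter()
    for answer in answers:
        # Lowercase the answer, then pull out words. Punctuation such as "!"
        # is dropped, but hyphens and apostrophes inside a word are kept,
        # so "decision-making" stays a single word.
        words = re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*", answer.lower())
        # Skip the common words and count everything else.
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

answers = ["fair", "fair!", "should be fair", "good at decision-making"]
for word, count in word_frequencies(answers).most_common():
    print(word, count)
```

With the three "fair" answers from my example, all of them end up counted under the single word "fair", which is what would give it the large size in the cloud.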

April 27, 2018