CarCloud: I did a sentiment analysis of the Tata Harrier threads

spgv · 27th July 2020, 01:25

Picture speaks a thousand words, literally in this case.
The government may also use similar analysis for its policies or politics.

A few questions from my side.

1. Are words that appear in the same sentence placed next to each other in the cloud? For example, when speaking of features, commenters may say they liked the touchscreen interface, sound quality, lounge lighting, etc., so the words "features" and "sound quality" would sit alongside the other keywords from that sentence. I think that would make more sense.

2. Colours have a direct impact on how we make sense of a scene; for example, yellow or red on a background of lighter shades immediately catches attention. In this case, colour affects the perceived weightage of a word.
A suggestion: words pertaining to engine, performance, reliability, etc. could be localised, if not confined, to the front of the car, and so on for other entities.
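To make question 1 concrete, here is a minimal sketch of counting which keywords turn up in the same sentence. It is only an illustration of the idea, not how the original clouds were built: the keyword list, the function name `cooccurrence_counts` and the sample text are all made up, and the sentence splitting is deliberately naive.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical keyword list; a real analysis would take these from the extracted entities.
KEYWORDS = {"features", "sound", "touchscreen", "lighting", "engine"}

def cooccurrence_counts(text, keywords=KEYWORDS):
    """Count how often each pair of keywords appears in the same sentence."""
    pairs = Counter()
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        present = sorted(words & keywords)
        # Every unordered pair of keywords in this sentence counts once.
        for pair in combinations(present, 2):
            pairs[pair] += 1
    return pairs

sample = (
    "I liked the features, especially the sound and the lighting. "
    "The engine is noisy. The sound is great and the features are plenty."
)
counts = cooccurrence_counts(sample)
```

Pairs with a high count are the ones that could be placed close together in the cloud.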

Will be waiting for similar analysis on other car models, and threads as well. Thanks for the excellent post, keep up the good work.

ihrishi · 27th July 2020, 19:05

Quote:

Originally Posted by Shubhendra

Good initiative and a great analysis.
One suggestion: if you can refine your list of words by removing a few more non-relevant words, it will be a fantastic analysis.
I am sure you might have done sentiment analysis of a current topic on Twitter through the twitteR package in R; it's an awesome timepass.

Thank you.
Removing non-relevant words is easily done. Are there any words you think can be removed without taking too much information out? Mind you, the same words will be removed across all threads, for a like-for-like comparison.
This model is going to be applied across all major threads eventually.

Quote:

Originally Posted by ACM

You could update this to CURRENT Sentiment, by considering only the period post the update. Much smaller sample but much more relevant maybe. Then Steering, NVH, Sunroof may just be significant in 1 section.

That makes sense. I can give this a go. I have it on my list of things to do.

Quote:

Originally Posted by AZ911

The IT way to look at things.

It would be great to compare other cars in the same segment and see what the word cloud comes up with. Then we can do a cloud vs cloud comparison.
I'd love to see the cloud for the ever finicky XUV500.

Bingo - that IS the plan!

Quote:

Originally Posted by Zinda

I just loved this idea. Outcome is impressive too.
Are you using Python's built in library for Logistic Regression model (I believe that's the algorithm in play here)?

Yes - using Python's built-in library with the Google NLP API for the sentiment analysis. All this is basic data analysis, which technically can be done in Excel too. No regression or ML model used yet. That will come later.

Quote:

Originally Posted by spgv

Picture speaks a thousand words, literally in this case. The government may also use similar analysis for its policies or politics.

A few questions from my side.
1. Are words that appear in the same sentence placed next to each other in the cloud? For example, when speaking of features, commenters may say they liked the touchscreen interface, sound quality, lounge lighting, etc., so the words "features" and "sound quality" would sit alongside the other keywords from that sentence. I think that would make more sense.
2. Colours have a direct impact on how we make sense of a scene; for example, yellow or red on a background of lighter shades immediately catches attention. In this case, colour affects the perceived weightage of a word.
A suggestion: words pertaining to engine, performance, reliability, etc. could be localised, if not confined, to the front of the car, and so on for other entities.
Will be waiting for similar analysis on other car models, and threads as well. Thanks for the excellent post, keep up the good work.

Thank you for your kind words.
1 - I don't think so. What I have done is pick up all the 'entities' in the comments and filter out a few common or irrelevant ones; the word cloud is then generated from the frequency of the individual words.
2 - Colors - I am using the default palette. Conditional coloring of the words may not be a feature of the library I am using. I will need to explore.
3 - Can you elaborate on your suggestion? I could not fully understand it.
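The steps in point 1 can be sketched roughly as below. This is a simplified illustration, not the actual script: the entity list is assumed to have been extracted already (for example by an NLP API), and the names `COMMON_WORDS` and `entity_frequencies` as well as the sample data are hypothetical.

```python
from collections import Counter

# Hypothetical list of common or irrelevant words, shared across all threads.
COMMON_WORDS = {"car", "thread", "people"}

def entity_frequencies(entities, ignore=COMMON_WORDS):
    """Normalise entity names, drop ignored ones and count the rest."""
    cleaned = (e.strip().lower() for e in entities)
    return Counter(e for e in cleaned if e and e not in ignore)

# Made-up entities, standing in for what was picked up from the comments.
entities = ["Steering", "car", "NVH", "steering", "Sunroof", "thread", "steering"]
freqs = entity_frequencies(entities)
```

A mapping like `freqs` is what a word cloud is drawn from. If the library happens to be the Python `wordcloud` package (an assumption, since the library is not named here), `WordCloud.generate_from_frequencies` accepts such a mapping and the `color_func` argument can be used for conditional coloring.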

GrandTourer · 27th July 2020, 19:39

This is really good stuff Rishi.

Quote:

Originally Posted by ihrishi

NEGATIVE ENTITIES - by frequency:

Please let me know if anyone would like to see any other insights!

I see a few words like "problem", "issue" and "lack" coming up with high frequencies, which you'd expect in the negative set. Why don't you take a look at TF-IDF, which will help you get rid of these low-value terms? There are libraries for that readily available in Python.
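To show what TF-IDF would do here, a bare-bones sketch in plain Python follows. It uses the textbook formula (term frequency times log of inverse document frequency); the function name `tf_idf` and the three toy "threads" are made up, and treating each thread as one document is an assumption. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing, so their numbers will differ.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per document, where a document is a list of tokens."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores

# Toy data: each inner list stands for the negative entities of one thread.
threads = [
    ["problem", "steering", "steering", "issue"],
    ["problem", "clutch", "issue", "clutch"],
    ["problem", "sunroof", "issue", "rattle"],
]
scores = tf_idf(threads)
```

Words that occur in every thread, like "problem" and "issue", get a score of zero, while thread-specific words like "steering" rise to the top.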

Those are my 2 cents. Great thread!
