Sentiment Analysis

Methods, Tools & Python Code — For Engineering Students

Last Updated: March 2026

📌 Key Takeaways

  • Definition: Sentiment analysis automatically identifies the emotional tone (positive, negative, neutral) expressed in text.
  • Three approaches: Lexicon-based (VADER, TextBlob), ML-based (Naive Bayes, Logistic Regression + TF-IDF), Deep Learning (BERT).
  • For social media text: Use VADER — specifically designed for short, informal text with emojis and slang.
  • For formal reviews with labelled data: Train Logistic Regression + TF-IDF — fast and accurate.
  • For highest accuracy: Fine-tune BERT — state-of-the-art but requires more compute.
  • Key challenges: Sarcasm, negation, domain-specificity, multilingual text.

1. What is Sentiment Analysis?

Sentiment analysis (also called opinion mining) is the use of NLP and machine learning to automatically determine the emotional tone or opinion expressed in a piece of text.

The most common task is polarity classification — labelling text as positive, negative, or neutral. More granular tasks include emotion detection (joy, anger, fear, surprise), aspect-based sentiment (what is positive/negative about a specific feature), and intensity scoring (how strongly positive or negative).

Sentiment analysis is one of the most commercially important NLP applications — companies process millions of customer reviews, social media mentions, and support tickets daily to understand public opinion at scale.

2. Types of Sentiment Analysis

| Type | Output | Example |
| --- | --- | --- |
| Polarity classification | Positive / Negative / Neutral | “Great product!” → Positive |
| Fine-grained | 5-point scale (Very negative to Very positive) | Star rating prediction |
| Emotion detection | Joy, Anger, Fear, Surprise, Sadness, Disgust | “I can’t believe this happened!” → Surprise/Anger |
| Aspect-based | Sentiment per product feature | “Battery is great but screen is dim” → Battery: +, Screen: − |
| Subjectivity detection | Subjective vs Objective | “The phone weighs 180g” → Objective |

3. Approach 1 — Lexicon-Based

Lexicon-based methods use predefined dictionaries (sentiment lexicons) that assign polarity scores to words. The sentiment of a text is computed by aggregating the scores of its constituent words.
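The aggregation idea can be shown in a few lines of pure Python. The lexicon below is a made-up toy (real lexicons such as VADER's contain thousands of scored entries, plus rules for negation and intensifiers):

```python
# Toy sentiment lexicon: word -> polarity score (invented values, for illustration only)
LEXICON = {"great": 0.8, "good": 0.5, "terrible": -0.9, "bad": -0.6, "slow": -0.3}

def lexicon_score(text):
    """Sum the scores of known words; words not in the lexicon contribute 0."""
    return sum(LEXICON.get(w, 0.0) for w in text.lower().split())

print(lexicon_score("great camera but slow shipping"))  # ~0.5 (0.8 - 0.3)
print(lexicon_score("terrible battery life"))           # -0.9
```

Real tools add rule layers on top of this summation (negation flipping, booster words, punctuation emphasis), which is why VADER outperforms naive word-score addition.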

VADER (Valence Aware Dictionary and sEntiment Reasoner)

VADER is the most widely used lexicon-based tool for social media text. It is specifically designed for short, informal text — tweets, reviews, comments — and handles capitalisation (“GREAT” > “great”), punctuation (“great!!!” > “great”), emojis (😊 contributes positive), slang, and degree modifiers (“extremely bad” vs “slightly bad”).

VADER outputs four scores: positive, negative, neutral (proportions), and a compound score from −1 (most negative) to +1 (most positive). A common threshold: compound ≥ 0.05 → positive; compound ≤ −0.05 → negative; otherwise neutral.

TextBlob

TextBlob provides two scores: polarity (−1 to +1) and subjectivity (0=objective to 1=subjective). It is simpler than VADER and works well for formal English text. Less effective for social media slang and emojis.

When to use lexicon-based: No labelled training data available; real-time processing needed; interpretability is important; quick prototyping.

4. Approach 2 — Machine Learning-Based

ML-based sentiment analysis trains a classifier on labelled sentiment data using text features (TF-IDF, word embeddings). This requires a labelled dataset (reviews with star ratings, manually labelled tweets, etc.) but typically outperforms lexicon methods on domain-specific text.

Standard Pipeline:

  1. Preprocess text (lowercase, remove noise, tokenise)
  2. Extract features: TF-IDF vectors (most common for ML models)
  3. Train classifier: Logistic Regression (best baseline), Naive Bayes (fast), SVM (high accuracy)
  4. Evaluate on test set: accuracy, F1 score, confusion matrix

Logistic Regression with TF-IDF features is the standard baseline — it achieves 85–90% accuracy on clean datasets like IMDb movie reviews and is fast, interpretable, and easy to deploy.
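The four pipeline steps above can be sketched end-to-end with scikit-learn. The toy corpus here is invented and far too small for a meaningful benchmark; swap in a real labelled dataset for actual numbers:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Invented toy corpus: 1 = positive, 0 = negative
texts = ["loved it", "great value", "works perfectly", "highly recommend",
         "broke quickly", "very disappointed", "poor quality", "do not buy"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0)

model = Pipeline([("tfidf", TfidfVectorizer()),        # step 2: features
                  ("clf", LogisticRegression())])      # step 3: classifier
model.fit(X_train, y_train)

y_pred = model.predict(X_test)                         # step 4: evaluate
print("accuracy:", accuracy_score(y_test, y_pred))
print("F1:", f1_score(y_test, y_pred))
print("confusion matrix:\n", confusion_matrix(y_test, y_pred))
```

The `Pipeline` keeps vectoriser and classifier together, so the same object handles both fitting and prediction without leaking test data into the TF-IDF vocabulary.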

5. Approach 3 — Deep Learning (BERT)

BERT (Bidirectional Encoder Representations from Transformers) achieves state-of-the-art sentiment analysis by fine-tuning a pre-trained language model on sentiment-labelled data. BERT’s bidirectional attention captures complex linguistic patterns — negation, sarcasm, contextual meaning — that TF-IDF cannot represent.

Fine-tuning BERT for sentiment involves: loading a pre-trained BERT model, adding a classification layer on top, and training on labelled sentiment data for a few epochs. Libraries like HuggingFace Transformers make this straightforward.
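A minimal sketch of that fine-tuning loop: load a pre-trained encoder, let the library attach a fresh classification head, and take one gradient step on two invented sentences. A real run would iterate over a labelled dataset for a few epochs (e.g. with HuggingFace's `Trainer`), and downloading the checkpoint requires network access:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "distilbert-base-uncased"  # smaller BERT variant, for illustration
tok = AutoTokenizer.from_pretrained(name)
# num_labels=2 adds a randomly initialised classification head on top of the encoder
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

batch = tok(["great product", "awful experience"],
            padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
out = model(**batch, labels=labels)  # cross-entropy loss computed internally
out.loss.backward()
optimizer.step()
print(f"loss after one step: {out.loss.item():.3f}")
```

The small learning rate (2e-5) is typical for fine-tuning: the pre-trained weights should be nudged, not overwritten.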

BERT typically achieves 92–95% accuracy on standard benchmarks like SST-2 and IMDb, compared to 85–90% for TF-IDF + Logistic Regression. The tradeoff: BERT is 100x larger and slower — use it when accuracy is critical and compute is available.

6. Approach Comparison

| Feature | Lexicon (VADER) | ML (LR + TF-IDF) | Deep Learning (BERT) |
| --- | --- | --- | --- |
| Training data needed | None | Yes (labelled) | Yes (or pre-trained) |
| Accuracy (typical) | 70–80% | 85–90% | 92–95% |
| Speed | Very fast | Fast | Slow (GPU recommended) |
| Social media text | Excellent | Good | Excellent |
| Sarcasm/Negation | Poor | Moderate | Good |
| Interpretability | High | Moderate | Low |
| Best for | Quick analysis, no data | Production with labelled data | Maximum accuracy |

7. Key Challenges

  • Negation: “This is not good” — “not” flips the sentiment of “good”. Simple lexicon models miss this. ML models learn it partially; BERT handles it well.
  • Sarcasm: “Oh great, another delay” — surface words are positive but sentiment is negative. Extremely hard to detect without context and world knowledge.
  • Domain specificity: “This movie is sick!” (positive slang) vs “The patient is sick” (medical, negative). Domain-specific training data is essential.
  • Multilingual text: Code-switching (mixing Hindi and English in one sentence) is common in Indian social media — standard models fail. Use multilingual BERT (mBERT) or XLM-RoBERTa.
  • Aspect ambiguity: “The food was great but the service was terrible” — overall sentiment is mixed, but aspect-based analysis reveals more nuanced insights.

8. Real-World Applications

| Domain | Application |
| --- | --- |
| E-commerce | Analysing product reviews to identify issues and highlight positives |
| Social Media | Brand monitoring — tracking public opinion about companies and products |
| Finance | Analysing news headlines and earnings call transcripts to predict stock movements |
| Healthcare | Analysing patient feedback and clinical notes for quality improvement |
| Politics | Tracking public opinion on policies and politicians from social media |
| Customer Service | Automatically routing negative feedback to priority queues |

9. Python Code


# Install: pip install vaderSentiment textblob scikit-learn transformers

# --- Approach 1: VADER (Lexicon-based) ---
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
texts = [
    "This product is absolutely amazing! 😊",
    "Terrible quality, complete waste of money.",
    "The battery lasts about 6 hours.",
    "NOT good at all!!! Very disappointing."
]

for text in texts:
    scores = analyzer.polarity_scores(text)
    compound = scores['compound']
    sentiment = 'Positive' if compound >= 0.05 else ('Negative' if compound <= -0.05 else 'Neutral')
    print(f"{sentiment:10} ({compound:+.3f}): {text[:50]}")

# --- Approach 2: ML-based (Logistic Regression + TF-IDF) ---
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Sample training data
train_texts = [
    "excellent product highly recommend",
    "amazing quality loved it",
    "best purchase ever made",
    "terrible product broke after one day",
    "worst quality never buying again",
    "complete waste of money disappointed"
]
train_labels = [1, 1, 1, 0, 0, 0]  # 1=Positive, 0=Negative

pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(ngram_range=(1, 2))),
    ('clf', LogisticRegression(random_state=42))
])
pipeline.fit(train_texts, train_labels)

test_texts = ["really good experience", "awful and broken"]
predictions = pipeline.predict(test_texts)
probabilities = pipeline.predict_proba(test_texts)
for text, pred, prob in zip(test_texts, predictions, probabilities):
    print(f"{'Positive' if pred==1 else 'Negative'} ({prob[pred]:.2f}): {text}")

# --- Approach 3: BERT-based (HuggingFace) ---
from transformers import pipeline as hf_pipeline

# Use pre-trained sentiment model (no fine-tuning needed for general sentiment)
sentiment_model = hf_pipeline("sentiment-analysis",
                               model="distilbert-base-uncased-finetuned-sst-2-english")
results = sentiment_model(["I love this!", "This is terrible."])
for r in results:
    print(f"{r['label']} ({r['score']:.3f})")
    

10. Frequently Asked Questions

How do I handle mixed sentiment in reviews?

Mixed sentiment is best handled by aspect-based sentiment analysis (ABSA) — identifying the specific aspects mentioned (battery, screen, price) and classifying sentiment for each aspect separately. For simple polarity, you can split the review into sentences and classify each sentence independently, then aggregate. BERT-based models handle mixed sentiment better than TF-IDF models due to contextual understanding.

What dataset should I use to train a sentiment model?

Popular public datasets: IMDb Movie Reviews (50,000 reviews, binary), Stanford Sentiment Treebank (11,855 sentences; SST-2 is the binary version, SST-5 the fine-grained five-class version), Amazon Product Reviews (millions of reviews with star ratings), Yelp Reviews (5 million reviews). For Indian social media, SentiRaama and datasets from SemEval multilingual shared tasks are available.

Next Steps