Here are 100 command prompts for performing sentiment analysis, categorized by the typical project workflow.
1. Data Collection & Acquisition
Scrape [product reviews] for [product name] from [e-commerce site].
Fetch the last [N] tweets containing the [hashtag or keyword].
Query the [database] for all [customer support tickets] in the last [time period].
Download [app store reviews] for [app name] from the [iOS/Android] store.
Load the [CSV/JSON] file containing [survey responses].
Extract all [Reddit comments] from [subreddit] mentioning [brand name].
Gather [hotel reviews] from [TripAdvisor/Booking.com] for [hotel name].
Collect all [political op-eds] from [news source] on [topic].
Assemble a corpus of [movie reviews] from [IMDb/Rotten Tomatoes].
Extract text from the [customer_feedback] column in the [SQL table].
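As one possible illustration of the "Load the [CSV/JSON] file" prompt above, here is a minimal sketch using only Python's standard library. The column names (`respondent_id`, `response`), the sample rows, and the `load_responses` helper are all assumptions made up for this example; a real survey export will look different.

```python
import csv
import io

# Hypothetical stand-in for a survey export; in practice this would be a file on disk.
SAMPLE_CSV = """respondent_id,response
1,"The checkout was fast, and support was friendly."
2,
3,Delivery took too long.
"""

def load_responses(handle, text_column="response"):
    """Return the non-empty values of text_column from an open CSV file handle."""
    reader = csv.DictReader(handle)
    texts = []
    for row in reader:
        value = (row.get(text_column) or "").strip()
        if value:  # skip respondents who left the field blank
            texts.append(value)
    return texts

responses = load_responses(io.StringIO(SAMPLE_CSV))
print(responses)
```

With a real file, the same function could be called as `load_responses(open("survey.csv", newline="", encoding="utf-8"))`, where `survey.csv` is a placeholder file name.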
2. Preprocessing & Data Cleaning
Convert all review text to lowercase.
Remove all URLs, HTML tags, and user mentions from the text.
Remove all punctuation and numerical digits.
Filter out all stopwords (e.g., 'the', 'is', 'a') in [language].
Perform lemmatization on all words (e.g., "running" -> "run").
Perform stemming on all words (e.g., "running" -> "run").
Expand all contractions (e.g., "don't" -> "do not").
Correct common misspellings found in the corpus.
Normalize all emojis and emoticons (e.g., ":)" -> "positive_emoji").
Filter out reviews that are too short (e.g., < 3 words) to be meaningful.
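Several of the cleaning prompts above can be chained into one small function. The sketch below covers lowercasing, removal of HTML tags, URLs, and mentions, contraction expansion, punctuation and digit removal, stopword filtering, and the minimum-length filter. The stopword and contraction lists are tiny illustrative assumptions, and lemmatization, stemming, and spelling correction are left out because they usually need an NLP library.

```python
import re
import string

# Tiny illustrative lists (assumptions); a real project would use fuller resources.
STOPWORDS = {"the", "is", "a", "an", "and", "it", "to", "of"}
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is", "isn't": "is not"}

HTML_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
STRIP_TABLE = str.maketrans("", "", string.punctuation + string.digits)

def clean_text(text):
    """Lowercase, strip markup/URLs/mentions, expand contractions, drop stopwords."""
    text = text.lower()
    text = HTML_RE.sub(" ", text)      # tags first, so URLs inside attributes go too
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = text.translate(STRIP_TABLE)  # remove punctuation and digits
    return [tok for tok in text.split() if tok not in STOPWORDS]

def is_meaningful(tokens, min_words=3):
    """Keep only reviews with at least min_words tokens after cleaning."""
    return len(tokens) >= min_words

tokens = clean_text("<p>I don't LOVE the battery!</p> See https://example.com @shop 10/10")
print(tokens, is_meaningful(tokens))
```

Note that negation words such as "not" are deliberately kept out of the stopword list here, since dropping them would change the sentiment of the sentence.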
3. Lexicon-Based (Rule-Based) Analysis
Calculate the sentiment polarity score for each review using TextBlob.
Calculate the sentiment compound score for each review using VADER.
Generate a sentiment score using the [AFINN-111] lexicon.
Identify the [N] most frequent positive words from the [Hu & Liu] lexicon.
Identify the [N] most frequent negative words from the [Hu & Liu] lexicon.
Analyze the impact of negation words (e.g., 'not', 'never') on VADER scores.
Analyze the impact of intensifiers (e.g., 'very', 'extremely') on VADER scores.
Plot a histogram of the VADER compound scores to see the distribution.
Classify text as [positive/negative/neutral] based on [TextBlob/VADER] score thresholds.
Create a custom sentiment lexicon for [industry-specific jargon].
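To make the ideas of a custom lexicon, negation, intensifiers, and score thresholds concrete, here is a toy rule-based scorer. It is not how TextBlob or VADER work internally; the word weights, the multipliers, and the 0.5 threshold are all invented for illustration.

```python
# Hypothetical mini-lexicon and modifier lists (assumptions, not taken from AFINN or VADER).
LEXICON = {"good": 2, "great": 3, "love": 3, "bad": -2, "terrible": -3, "slow": -1}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}

def lexicon_score(tokens):
    """Sum word weights, boosting after an intensifier and flipping after a negation."""
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        value = float(LEXICON[tok])
        prev = tokens[i - 1] if i > 0 else ""
        if prev in INTENSIFIERS:
            value *= INTENSIFIERS[prev]
            # Look one word further back so "not very good" is still negated.
            prev = tokens[i - 2] if i > 1 else ""
        if prev in NEGATIONS:
            value = -value
        score += value
    return score

def classify(score, threshold=0.5):
    """Map a raw score to a label using a symmetric threshold."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

for sentence in ["the screen is great", "not very good", "it arrived on monday"]:
    s = lexicon_score(sentence.split())
    print(sentence, "->", s, classify(s))
```

Simply flipping the sign on negation is a crude choice; established lexicon tools tend to dampen the score rather than invert it, which is one reason to compare a toy like this against them.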
4. Machine Learning - Data Preparation & Labeling
Manually label [N] reviews as [positive, negative, neutral] for a gold-standard dataset.
Convert [1-5 star ratings] into sentiment labels (e.g., 1-2=Neg, 3=Neu, 4-5=Pos).
Split the labeled dataset into [80% train, 10% validation, 10% test] sets.
Balance the imbalanced dataset using [oversampling (SMOTE)].
Balance the imbalanced dataset using [undersampling].
Create a document-term matrix (DTM) using Bag-of-Words (BoW) counts.
Create a DTM using TF-IDF vectorization.
Generate word embeddings (e.g., Word2Vec) for the training corpus.
Generate document embeddings using a pre-trained [BERT] model.
Compare the performance of [TF-IDF] vs. [BERT embeddings].
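The star-rating conversion and the 80/10/10 split from the list above might look like the following sketch. The function names and the synthetic `(text, stars)` rows are assumptions for illustration; the split is a plain shuffled split, not a stratified one.

```python
import random

def stars_to_label(stars):
    """Map a 1-5 star rating to a label: 1-2 negative, 3 neutral, 4-5 positive."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"

def split_dataset(rows, train=0.8, val=0.1, seed=42):
    """Shuffle rows reproducibly and cut them into train/validation/test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train)
    n_val = round(len(rows) * val)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

# Synthetic example data: (review text, star rating) pairs.
reviews = [(f"review {i}", (i % 5) + 1) for i in range(100)]
labeled = [(text, stars_to_label(stars)) for text, stars in reviews]

train_set, val_set, test_set = split_dataset(labeled)
print(len(train_set), len(val_set), len(test_set))
```

For imbalanced data, a stratified split that preserves the class proportions in each subset is usually a safer choice than the plain shuffle shown here.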
5. Machine Learning - Model Training
Train a [Naive Bayes] classifier on the [TF-IDF] features.
Train a [Logistic Regression] classifier on the [TF-IDF] features.
Train a [Support Vector Machine (SVM)] classifier.
Train a [Random Forest] classifier and analyze feature importance.
Train a [Recurrent Neural Network (RNN)/LSTM] on the text sequences.
Fine-tune a pre-trained [BERT/RoBERTa] model on the sentiment task.
Perform hyperparameter tuning (e.g., Grid Search) for the [Logistic Regression] model.
Train a multi-class classifier to predict the exact [1-5 star rating].
Build a binary classifier to predict only [positive vs. negative].
Compare the training time of the [Naive Bayes] vs. [BERT] model.
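For a feel of what the simplest model in this list does under the hood, here is a from-scratch sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It works on raw token counts rather than TF-IDF features, the class name `NaiveBayes` and the four training documents are invented for the example, and in practice a library implementation would normally be used instead.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over token counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            logp = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][tok]
                logp += math.log((count + 1) / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

# Tiny made-up training set of tokenized reviews.
docs = [["great", "battery"], ["love", "screen"], ["terrible", "battery"], ["bad", "service"]]
labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(docs, labels)
print(model.predict(["great", "screen"]), model.predict(["terrible", "service"]))
```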
6. Machine Learning - Model Evaluation
Generate a confusion matrix for the test set predictions.
Calculate the overall accuracy of the sentiment model.
Calculate the precision, recall, and F1-score for the [negative] class.
Calculate the weighted-average F1-score across all classes.
Plot the ROC curve and calculate the AUC score for the [positive] class.
Establish a baseline performance (e.g., majority class, lexicon model) for comparison.
Perform K-fold cross-validation on the model.
Identify the top [N] most positive features (words) from the [Logistic Regression] model.
Identify the top [N] most negative features (words) from the [Logistic Regression] model.
Perform an error analysis: review the top [N] false positives.
Perform an error analysis: review the top [N] false negatives.
Use LIME or SHAP to explain the prediction for a [single misclassified review].
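The first few evaluation prompts come down to counting agreements and disagreements between true and predicted labels. The sketch below computes a confusion matrix, accuracy, and per-class precision, recall, and F1 by hand; the label lists are made-up example data, and the helper names are assumptions.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    """Count (true_label, predicted_label) pairs."""
    return Counter(zip(y_true, y_pred))

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred, cls):
    """Precision, recall, and F1 for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Made-up test-set labels and predictions.
y_true = ["neg", "neg", "pos", "pos", "neu", "neg"]
y_pred = ["neg", "pos", "pos", "pos", "neu", "neg"]

matrix = confusion_matrix(y_true, y_pred)
acc = accuracy(y_true, y_pred)
neg_precision, neg_recall, neg_f1 = precision_recall_f1(y_true, y_pred, "neg")
print(matrix[("neg", "pos")], acc, neg_precision, neg_recall, neg_f1)
```

Here one negative review was predicted as positive, so the negative class keeps perfect precision but loses recall, which is exactly the kind of pattern the error-analysis prompts are meant to surface.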
7. Advanced Sentiment Analysis
Perform Aspect-Based Sentiment Analysis (ABSA) to find opinions on specific features.
Extract all product aspects/features (e.g., "battery", "screen", "service").
Assign a sentiment score to each extracted aspect (e.g., "battery": positive, "screen": negative).
Perform emotion classification (e.g., joy, anger, sadness, fear, surprise).
Plot the distribution of emotions found in the [customer support tickets].
Identify the top keywords associated with the "anger" emotion.
Perform subjectivity/objectivity detection to filter out factual statements.
Build a model to detect [sarcasm/irony] in the text.
Perform intent classification (e.g., "complaint", "question", "praise").
Extract all opinion-target pairs from the text.
Analyze the sentiment of text containing [emojis and emoticons].
Perform sentiment analysis on [multi-lingual text] using a [multilingual model].
Filter for [toxic/hateful] language within the negative comments.
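As a rough illustration of the aspect-based idea, the sketch below splits a review into clauses and credits each clause's lexicon score to any aspect keyword found in it. This is a deliberately naive heuristic, not a real ABSA method (which would typically rely on dependency parsing or a trained model), and the aspect list, the word weights, and the `aspect_sentiment` name are assumptions.

```python
import re

# Hypothetical aspect keywords and opinion-word weights for illustration only.
ASPECTS = {"battery", "screen", "service"}
LEXICON = {"great": 1, "love": 1, "amazing": 1, "poor": -1, "dim": -1, "slow": -1}

CLAUSE_SPLIT_RE = re.compile(r"[.,;!?]|\bbut\b")

def aspect_sentiment(text):
    """Return {aspect: score}, scoring each aspect by the clause it appears in."""
    results = {}
    for clause in CLAUSE_SPLIT_RE.split(text.lower()):
        tokens = re.findall(r"[a-z']+", clause)
        clause_score = sum(LEXICON.get(tok, 0) for tok in tokens)
        for tok in tokens:
            if tok in ASPECTS:
                results[tok] = results.get(tok, 0) + clause_score
    return results

review = "The battery is great, but the screen is dim. Service was slow!"
print(aspect_sentiment(review))
```

The clause-level shortcut breaks down as soon as one clause mentions two aspects with different opinions, which is a good test case to try against any fuller approach.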
8. Time-Series & Trend Analysis
Plot the average sentiment score per [day/week/month].
Analyze sentiment trends before and after a [product launch/PR event].
Identify the date with the lowest average sentiment (sentiment trough).
Plot the volume of positive vs. negative reviews over time.
Detect anomalies or sudden shifts in sentiment using a time-series model.
Correlate sentiment trends with [sales data] or [stock price].
Analyze the changing sentiment of [brand name] over the past [year].
Compare the sentiment trends of [Product A] vs. [Product B].
Identify emerging negative topics by tracking n-gram frequency in negative reviews.
Track the sentiment of [competitor brand] over the same period.
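Before plotting anything, the per-period averages and the trough have to be computed. Here is a minimal sketch that groups scored reviews by month and picks the month with the lowest average; the dates and scores are invented sample data, and the `average_by_month` helper is an assumption.

```python
from collections import defaultdict
from datetime import date

def average_by_month(records):
    """Group (date, score) pairs by (year, month) and average each group."""
    buckets = defaultdict(list)
    for day, score in records:
        buckets[(day.year, day.month)].append(score)
    return {month: sum(scores) / len(scores) for month, scores in sorted(buckets.items())}

# Made-up (review date, sentiment score) pairs.
records = [
    (date(2024, 1, 5), 0.6),
    (date(2024, 1, 20), 0.2),
    (date(2024, 2, 3), -0.4),
    (date(2024, 2, 17), -0.2),
    (date(2024, 3, 1), 0.5),
]

monthly = average_by_month(records)
trough = min(monthly, key=monthly.get)  # month with the lowest average sentiment
print(monthly, trough)
```

Swapping the grouping key for the date itself or for an ISO week number gives the daily or weekly variants of the same calculation.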
9. Reporting, Visualization, & Action
Create a bar chart of the overall sentiment distribution (Positive/Negative/Neutral).
Create a word cloud for the [most common positive] words.
Create a word cloud for the [most common negative] words.
Calculate a Net Sentiment Score (NSS) (e.g., (%Pos - %Neg) / (%Pos + %Neg)).
Summarize the top 3 drivers of negative sentiment.
Summarize the top 3 drivers of positive sentiment (what customers love).
Identify the product aspect (from ABSA) with the *worst* sentiment score.
Identify the product aspect (from ABSA) with the *best* sentiment score.
Formulate a data-driven recommendation for the [product team].
Formulate a data-driven recommendation for the [customer service team].
Build an interactive dashboard to display key sentiment metrics.
Prepare a report for stakeholders summarizing all key findings.
Set up an alert system for a sudden spike in [negative sentiment volume].
Prioritize [customer complaints] for follow-up based on sentiment severity.
Calculate the ROI of improving sentiment for a [specific product aspect].
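The Net Sentiment Score formula quoted above, (%Pos - %Neg) / (%Pos + %Neg), can be computed directly from label counts, since the shared percentage denominator cancels out. A minimal sketch follows, with made-up labels and an assumed convention of returning 0.0 when there are no positive or negative reviews at all.

```python
from collections import Counter

def net_sentiment_score(labels):
    """(Pos - Neg) / (Pos + Neg), ranging from -1 (all negative) to +1 (all positive)."""
    counts = Counter(labels)
    pos, neg = counts["positive"], counts["negative"]
    if pos + neg == 0:
        return 0.0  # assumed convention when only neutral (or no) reviews exist
    return (pos - neg) / (pos + neg)

# Made-up predictions: 6 positive, 2 negative, 2 neutral.
labels = ["positive"] * 6 + ["negative"] * 2 + ["neutral"] * 2
nss = net_sentiment_score(labels)
print(nss)
```

Because neutral reviews drop out of this version of the formula, it is worth reporting the neutral share alongside the score so that a high NSS on a mostly neutral dataset is not over-read.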