Write a program that asks the user for a word. Next, open the movie_reviews.txt file and examine each review, one at a time. If a review contains the desired word, add its review score to an accumulator variable and count the review. Finally, produce output that tells your user how often the word was used across all reviews, the average score of the reviews that contain it, and the classification for this word (an average score of 2.0 or higher can be considered “positive” and an average score less than 2.0 can be considered “negative”). For example:
: happy
happy appears 17 times
The average score for reviews containing the word is 2.588235294117647
This is a positive word
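The accumulator approach described above can be sketched as follows. This is a minimal sketch, not the required solution. It assumes each line of the file starts with a numeric score followed by the review text (as in "4 I loved it"), and that "contains" means a case-insensitive whole-word match. The function names `word_stats` and `classify` and the sample reviews are made up for illustration.

```python
def word_stats(lines, word):
    """Return (count, average_score) for the reviews that contain `word`."""
    word = word.lower()
    total = 0.0  # accumulator for the scores of matching reviews
    count = 0    # number of matching reviews
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        score = float(parts[0])  # assumption: the score is the first token
        review_words = [w.lower() for w in parts[1:]]
        if word in review_words:
            total += score
            count += 1
    average = total / count if count else None
    return count, average


def classify(average):
    """2.0 or higher is positive, anything lower is negative."""
    return "positive" if average >= 2.0 else "negative"


# Hypothetical sample data standing in for movie_reviews.txt.
sample = ["4 I loved it", "1 I hated it", "3 A happy ending"]
count, average = word_stats(sample, "it")
print("it appears", count, "times")
print("The average score for reviews containing the word is", average)
print("This is a", classify(average), "word")
```

In the real program the word would come from input() and the lines from open("movie_reviews.txt"). A word that never appears gives a count of 0 and no average, so check for that case before classifying.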
Write a program that computes the average score for every word in the movie_reviews.txt file. Do this by building a program that does the following:
- Set up a new dictionary variable called ‘words’
- Iterate over every review in the text file.
- Examine every word in every review and clean it up, if necessary. You should remove all punctuation and numbers from each word and replace them with empty strings.
- If this is the first time you have seen this word (i.e. it is not in your dictionary yet), add a new entry to your dictionary for that word (i.e. the word becomes a new key in the dictionary). The value to store at this key should be a list that contains two elements: the review score and the number 1 (indicating that you’ve seen this word 1 time).
- If you have seen the word before (i.e. it is already in your dictionary), add the new score to the score stored in your list and increase the number of times that you have seen this word by 1. For example:
4 I loved it
1 I hated it

words[‘loved’] = [4, 1]
words[‘hated’] = [1, 1]
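The dictionary-building steps above can be sketched as follows. This is one possible approach under stated assumptions, not the only way to do it. It assumes each line starts with a numeric score, and it also lowercases every word, which the steps above do not require. The helper names `clean` and `build_words` are made up for illustration.

```python
import string

# Translation table that deletes punctuation characters and digits.
_STRIP = str.maketrans("", "", string.punctuation + string.digits)


def clean(word):
    """Lowercase a word (an assumption) and remove punctuation and numbers."""
    return word.lower().translate(_STRIP)


def build_words(lines):
    """Map each word to a two-element list: [total score, times seen]."""
    words = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        score = float(parts[0])  # assumption: the score is the first token
        for raw in parts[1:]:
            word = clean(raw)
            if not word:
                continue  # nothing left after cleaning, e.g. "2" or "!!"
            if word not in words:
                words[word] = [score, 1]   # first time we see this word
            else:
                words[word][0] += score    # add the new score
                words[word][1] += 1        # seen one more time
    return words


words = build_words(["4 I loved it", "1 I hated it"])
print(words["loved"])  # [4.0, 1]
print(words["it"])     # [5.0, 2]
```

The average score for a word is then the first element of its list divided by the second.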
- Report to the user that the analysis of the ‘movie_reviews.txt’ file has been completed. Also give them a summary of how long this took (hint: import the time module and use time.time() to compute the current time before and after your analysis algorithm and then compute the difference). For example:
Initializing sentiment database
Sentiment database initialization complete
Total unique words analyzed: 16126
Analysis took 0.19 seconds to complete
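The timing hint can be sketched as follows: record time.time() before and after the analysis and subtract. This is a sketch only. The `timed` helper and the stand-in `count_unique` analysis are made up for illustration, and a real program would time the dictionary-building code instead.

```python
import time


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.time()            # time before the analysis
    result = func(*args)
    elapsed = time.time() - start  # difference = how long it took
    return result, elapsed


def count_unique(lines):
    """Stand-in analysis: count distinct lowercase words, skipping the score."""
    seen = set()
    for line in lines:
        seen.update(line.lower().split()[1:])
    return len(seen)


print("Initializing sentiment database")
unique, elapsed = timed(count_unique, ["4 I loved it", "1 I hated it"])
print("Sentiment database initialization complete")
print("Total unique words analyzed:", unique)
print("Analysis took", format(elapsed, ".2f"), "seconds to complete")
```

The format(elapsed, ".2f") call rounds the elapsed time to two decimal places, matching the style of the sample output.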
- Next, ask the user for a phrase. Analyze each word in this phrase and use your dictionary to compute the average score for each word. Also compute whether the overall phrase is positive or negative by averaging together the scores for each word that is contained within the phrase. For example:
Initializing sentiment database
Sentiment database initialization complete
Total unique words analyzed: 
Analysis took seconds to complete
Enter a phrase to test: i loved it
* appears times an average rating 
* appears times an average rating 
* appears times an average rating 
Average score phrase is: 
This is a POSITIVE phrase
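The phrase analysis can be sketched as follows. This sketch assumes the dictionary maps each word to [total score, times seen], and that words missing from the dictionary are simply skipped. The function name `phrase_score`, the exact wording of the printed lines, and the tiny hand-built dictionary are made up for illustration.

```python
def phrase_score(phrase, words):
    """Print per-word averages and return the average of those averages."""
    averages = []
    for word in phrase.lower().split():
        if word in words:  # assumption: unknown words are ignored
            total, count = words[word]
            average = total / count
            print("*", word, "appears", count,
                  "times with an average rating of", average)
            averages.append(average)
    if not averages:
        return None  # no word of the phrase is in the dictionary
    return sum(averages) / len(averages)


# Hypothetical dictionary; the real one is built from movie_reviews.txt.
words = {"i": [5, 2], "loved": [4, 1], "it": [5, 2]}
score = phrase_score("i loved it", words)
print("Average score for this phrase is:", score)
print("This is a", "POSITIVE" if score >= 2.0 else "NEGATIVE", "phrase")
```

In the real program the phrase would come from input(), and each word should be cleaned the same way as when the dictionary was built so that lookups match.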
Modularize your sentiment analysis program so that it’s easier to use. Create a function called ‘sentiment’ that takes one argument: a string of data. This function should ‘clean up’ the string by making it lowercase and removing punctuation. It should then compute a sentiment score for the string of text and return it. You can probably copy and paste almost all of this from the previous program that we wrote. Note: this function SHOULD NOT re-open the ‘movie_reviews.txt’ file and re-analyze it. Simply do this once at the beginning of the program and re-use the same dictionary inside of your function. Test your function using the following sample code:
a1 = sentiment("The happy dog and the sad cat")
a2 = sentiment("It made me want to poke out my eyeballs")
a3 = sentiment("I loved this movie!")
print(a1, a2, a3)
# 2.280133625200816 1.768915591909414 2.07085642181999
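The overall shape of such a function can be sketched as follows. This is a minimal sketch under assumptions. The dictionary is built once at module level (here a tiny hand-made stand-in with hypothetical values), every occurrence of a word counts toward the average, and unknown words are skipped. With this stand-in data the scores will not match the sample values above.

```python
import string

# Translation table that deletes punctuation characters and digits.
_STRIP = str.maketrans("", "", string.punctuation + string.digits)

# Built ONCE when the program starts. Hypothetical stand-in values; the real
# dictionary comes from analyzing movie_reviews.txt a single time.
words = {"the": [10, 5], "happy": [6, 2], "sad": [3, 2]}


def sentiment(text):
    """Return the average per-word score of `text`, or None if no word is known."""
    averages = []
    for raw in text.split():
        word = raw.lower().translate(_STRIP)  # clean up: lowercase, no punctuation
        if word in words:                     # re-use the shared dictionary
            total, count = words[word]
            averages.append(total / count)
    if not averages:
        return None
    return sum(averages) / len(averages)


print(sentiment("The happy dog and the sad cat"))  # 2.125 with the stand-in data
```

Because `sentiment` only reads the module-level dictionary, it can be called as many times as needed without re-opening the file.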