The first stage of using a Bayesian filter is training. During training, the filter is shown a number of messages, each labeled as spam or not spam. Denote the set of spam messages as Spam and the set of all other messages as Good.
The filter parses the messages into words and, for each word w, calculates the probability that it occurs in a spam message:
and the probability that it occurs in a message that is not spam:
After that, for each word w in a message M to be classified, the filter uses this information to calculate the probability that M ∈ Spam given that w occurs in M, using Bayes’ formula:
Here and . Here, let 0/0 = 0 (this happens if the word did not occur in the training messages).
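For reference, a standard formulation consistent with the description above can be written as follows. Estimating the word probabilities by message frequency and taking the priors proportional to the training set sizes are assumptions, not taken from the statement.

```latex
\[
P(w \mid \mathrm{Spam}) = \frac{|\{M \in \mathrm{Spam} : w \in M\}|}{|\mathrm{Spam}|},
\qquad
P(w \mid \mathrm{Good}) = \frac{|\{M \in \mathrm{Good} : w \in M\}|}{|\mathrm{Good}|}
\]
\[
P(M \in \mathrm{Spam} \mid w \in M) =
\frac{P(w \mid \mathrm{Spam})\, P(\mathrm{Spam})}
     {P(w \mid \mathrm{Spam})\, P(\mathrm{Spam}) + P(w \mid \mathrm{Good})\, P(\mathrm{Good})}
\]
\[
P(\mathrm{Spam}) = \frac{|\mathrm{Spam}|}{|\mathrm{Spam}| + |\mathrm{Good}|},
\qquad
P(\mathrm{Good}) = \frac{|\mathrm{Good}|}{|\mathrm{Spam}| + |\mathrm{Good}|}
\]
```

Under this reading, the denominator of Bayes’ formula is zero exactly when w occurs in no training message, which matches the 0/0 = 0 convention.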
After that, the filter calculates the fraction of the words in the message that indicate that M ∈ Spam with probability at least 1/2. If this fraction reaches some threshold t, the message is classified as spam. Note that each word is analyzed only once, even if it occurs several times in the message.
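The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference solution: it assumes that a word is a maximal run of letters or digits compared case-insensitively, that word probabilities are estimated by message frequency, that the priors are proportional to the training set sizes, and that the threshold is passed as a fraction between 0 and 1. The function names are hypothetical.

```python
import re
from collections import Counter
from fractions import Fraction


def words(message):
    """Distinct words of a message (assumption: case-insensitive alphanumeric runs)."""
    return set(re.findall(r"[a-z0-9]+", message.lower()))


def safe_div(num, den):
    """Exact division with the convention 0/0 = 0 (any zero denominator yields 0)."""
    num, den = Fraction(num), Fraction(den)
    return Fraction(0) if den == 0 else num / den


def train(spam, good):
    """Count, for each word, how many spam and how many good messages contain it."""
    spam_counts = Counter(w for m in spam for w in words(m))
    good_counts = Counter(w for m in good for w in words(m))
    return spam_counts, good_counts, len(spam), len(good)


def spam_probability(word, model):
    """P(M in Spam | word in M) by Bayes' formula, under the assumed estimates."""
    spam_counts, good_counts, n_spam, n_good = model
    total = n_spam + n_good
    p_w_spam = safe_div(spam_counts[word], n_spam)  # assumed estimate of P(w | Spam)
    p_w_good = safe_div(good_counts[word], n_good)  # assumed estimate of P(w | Good)
    p_spam = safe_div(n_spam, total)                # assumed prior P(Spam)
    p_good = safe_div(n_good, total)                # assumed prior P(Good)
    numerator = p_w_spam * p_spam
    return safe_div(numerator, numerator + p_w_good * p_good)


def is_spam(message, model, threshold):
    """True if the fraction of distinct words indicating spam reaches the threshold."""
    ws = words(message)
    hits = sum(1 for w in ws if spam_probability(w, model) >= Fraction(1, 2))
    return safe_div(hits, len(ws)) >= threshold


model = train(
    ["Buy our best computers!", "You will find our computers best!"],
    ["I have completed writing problems for trainings.",
     "I have problems with my computer."],
)
first = is_spam("Computers? Solution!", model, Fraction(1, 2))
second = is_spam("I do not know what to buy. Need help.", model, Fraction(1, 2))
```

Exact rational arithmetic via `Fraction` is used so that comparisons against 1/2 and against the threshold are not affected by floating-point rounding. Using a set of words per message ensures that each word is analyzed only once.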
In this problem you have to implement a Bayesian spam filter. You will be given a set of messages to train on, followed by a set of messages to classify. For each message to classify, you have to tell whether it is spam or not.
2 2 2 50 Buy our best computers! You will find our computers best! I have completed writing problems for trainings. I have problems with my computer. Computers? Solution! I do not know what to buy. Need help.