Peter J. Holzer e6a4ba72f1 Smooth limits of spam probability
Instead of clipping the probability to [0.01, 0.99], we add 1 to each
side. With my current corpus size this yields very similar limits (they
will creep closer to 0 and 1 with a larger corpus, but never reach
them) while avoiding lots of tokens with exactly the same probability.
This makes the selection by judge_message less random and more
relevant: it prefers tokens which have been seen more frequently.
2019-08-17 11:32:59 +02:00
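The smoothing described in the commit message amounts to add-one (Laplace) smoothing of the per-token spam probability. A minimal sketch in Python of one plausible reading of "add 1 to each side", assuming per-token spam/ham counts are available; the function and variable names are illustrative, not taken from the repository:

```python
def token_spam_probability(spam_count: int, ham_count: int) -> float:
    """Estimate the probability that a token indicates spam.

    Instead of computing spam_count / (spam_count + ham_count) and
    clipping the result to [0.01, 0.99], add 1 to each side of the
    ratio: the estimate stays strictly between 0 and 1 and only creeps
    towards the limits as the counts grow, so frequently seen tokens
    get more extreme (and hence more decisive) probabilities.
    """
    return (spam_count + 1) / (spam_count + ham_count + 2)


# A token seen 3 times in spam and never in ham gets 0.8, while one
# seen 30 times in spam and never in ham gets ~0.97, so the more
# frequently seen token is preferred when judging a message.
if __name__ == "__main__":
    print(token_spam_probability(3, 0))    # 0.8
    print(token_spam_probability(30, 0))   # 0.96875
```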
File            Last commit                          Date
add_message     Implement basic idea                 2019-08-17 09:29:11 +02:00
aggregate       Smooth limits of spam probability    2019-08-17 11:32:59 +02:00
judge_message   Avoid overlapping tokens             2019-08-17 11:12:34 +02:00