I just migrated my blog to the latest version of BlogEngine.NET, 2.5.0.6.
I had a shock when I saw the amount of spam that I had on the blog!
447883 spam comments! Wow. So I started the cleanup with the BlogEngine tools, but it was damn slow, and there is no way to stop it once you have started the delete-all.
So I stopped the web site, which was a bad idea because one XML file got damaged. As I always do a backup before doing something like that, I was on the safe side and just reverted the files.
Then I used 7zip to zip the posts folder, located in App_Data, which was 338 MB. Again, wow.
I downloaded the zip file to my local machine, installed BlogEngine and imported the posts.
I thought it would be faster on my machine because it is a recent one. But it was still too slow to process 447833 spam messages.
So, as a developer, I went on and wrote a little application to do it. After the spam cleanup, which took less than 10 seconds, this was the size of the posts folder:
Quite a difference! And here is BlogEngine showing me the results:
And here is the code. It uses .NET Framework 4 and parallel queries (PLINQ) to process the files:
    #region using

    using System;
    using System.IO;
    using System.Linq;
    using System.Xml;
    using System.Xml.Linq;

    #endregion

    namespace BlogEngineSpamDelete
    {
        internal class Program
        {
            private static void Main(string[] args)
            {
                var files = Directory.GetFiles(@"C:\Temp\blogengine\posts", "*.xml");

                // ForAll runs FixPost on several files at once. A plain foreach over
                // AsParallel() would still call FixPost one file at a time.
                files.AsParallel().ForAll(FixPost);
            }

            private static void FixPost(string file)
            {
                // Load the post and close the file before writing back to it.
                XDocument doc;
                using (var stream = File.OpenRead(file))
                {
                    doc = XDocument.Load(stream);
                }

                var comments = from comment in doc.Descendants(XName.Get("comment", String.Empty))
                               select comment;

                // ToArray takes a snapshot so that removing nodes does not disturb the query.
                var spamComments = (from comment in comments.ToArray()
                                    let data = new CommentState(comment.Attribute("spam").Value,
                                                                comment.Attribute("approved").Value,
                                                                comment.Attribute("deleted").Value)
                                    where ShouldDeleteSpamAndUnApproved(data)
                                    select comment).ToArray();

                foreach (var spamComment in spamComments)
                {
                    spamComment.Remove();
                }

                // Overwrite the original post file with the cleaned document.
                using (var writer = XmlWriter.Create(file, new XmlWriterSettings {Indent = true}))
                {
                    doc.WriteTo(writer);
                }
            }

            // Stricter rule, not used above: keeps comments that are only waiting for approval.
            private static bool ShouldDeleteSpam(CommentState commentState)
            {
                return !commentState.Approved && (commentState.Spam || commentState.Deleted);
            }

            // Rule used above: also removes every comment that was never approved.
            private static bool ShouldDeleteSpamAndUnApproved(CommentState commentState)
            {
                return !commentState.Approved || commentState.Spam || commentState.Deleted;
            }

            private class CommentState
            {
                public CommentState(String spam, String approved, String deleted)
                {
                    Approved = bool.Parse(approved);
                    Spam = bool.Parse(spam);
                    Deleted = bool.Parse(deleted);
                }

                public bool Approved { get; private set; }
                public bool Spam { get; private set; }
                public bool Deleted { get; private set; }
            }
        }
    }
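The program defines two rules, ShouldDeleteSpam and ShouldDeleteSpamAndUnApproved, but only calls the second one. Before deleting anything it can be worth counting what each rule would remove. Here is a minimal sketch of such a dry run. The class SpamPreview, its method names and the sample XML are made up for illustration; the sample only mirrors the three attributes the cleanup code reads and is not a real BlogEngine post file. Unlike the code above, it treats a missing attribute as false instead of throwing.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

internal static class SpamPreview
{
    // Hypothetical sample data: only the approved/spam/deleted attributes matter here.
    public const string SampleXml =
        @"<post>
  <comments>
    <comment id=""1"" approved=""True"" spam=""False"" deleted=""False"" />
    <comment id=""2"" approved=""False"" spam=""True"" deleted=""False"" />
    <comment id=""3"" approved=""False"" spam=""False"" deleted=""False"" />
    <comment id=""4"" approved=""True"" spam=""True"" deleted=""False"" />
  </comments>
</post>";

    // Reads a boolean attribute; a missing attribute counts as false (an assumption).
    private static bool Flag(XElement comment, string name)
    {
        var attribute = comment.Attribute(name);
        return attribute != null && bool.Parse(attribute.Value);
    }

    // Same condition as ShouldDeleteSpam: unapproved AND (spam OR deleted).
    public static bool IsUnapprovedSpam(XElement comment)
    {
        return !Flag(comment, "approved")
               && (Flag(comment, "spam") || Flag(comment, "deleted"));
    }

    // Same condition as ShouldDeleteSpamAndUnApproved: unapproved OR spam OR deleted.
    public static bool IsSpamOrUnapproved(XElement comment)
    {
        return !Flag(comment, "approved")
               || Flag(comment, "spam")
               || Flag(comment, "deleted");
    }

    // Counts the comments a rule would delete, without changing the document.
    public static int CountMatches(XDocument doc, Func<XElement, bool> rule)
    {
        return doc.Descendants("comment").Count(rule);
    }

    private static void Main()
    {
        var doc = XDocument.Parse(SampleXml);
        Console.WriteLine("Stricter rule would delete: {0}", CountMatches(doc, IsUnapprovedSpam));
        Console.WriteLine("Rule used above would delete: {0}", CountMatches(doc, IsSpamOrUnapproved));
    }
}
```

On this sample the stricter rule matches only comment 2, while the rule used in the program matches comments 2, 3 and 4. That includes comment 3, which is merely waiting for approval, and comment 4, which is approved but flagged as spam.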
Update: I also posted the code on Bitbucket: https://bitbucket.org/lkempe/blogenginespamdelete