Database Garbage Collection (for Business Entities)

23 Jun, 2008

When I walk around customer offices, I see that they have rules for cleaning stuff up:

If an employee starts filling out an annual-leave request form, but then changes their mind, the form goes in the bin. If a half-completed pension application form never gets sign-off, it might be filed for 3 years, and then the IFA puts it in the bin.

As soon as you write something on a piece of paper, it is persistent. Paper is a reliable storage mechanism for keeping your "Annual Leave Request" around, even if you need to nip out for a 3-hour meeting.

Should software applications do things differently? Shouldn't our web forms all auto-save? This led me to think that business entities should be saved at any time, regardless of whether they are valid or not.

If saving entities at any time is a good idea, then we need some way to clean up the ones that are forgotten, abandoned, or lost. Otherwise our databases will just fill up with crap that is of no use or interest. The business will dictate how and when things are cleaned up, because it knows the most efficient and legal ways of keeping things tidy.

If you believe that most business rules belong in the business entity, then you might agree that this kind of business knowledge should gravitate around business entities too. I'm calling this topic Entity Garbage Collection for now. That said, if your entities are persisted into a relational database, then it's also a kind of Database Garbage Collection.

Entity Garbage Collection

If I were implementing Entity Garbage Collection, I'd probably want something like the following:

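using System;

public enum State { Draft, Approved, Rejected, Archived, Garbage }

public class PensionApplication
{
    public State State { get; private set; } = State.Draft;
    public DateTime UpdatedAt { get; set; }
    public DateTime ArchivedAt { get; set; }

    // Anything in Draft state that has not been touched
    // for 3 months gets binned.
    void GarbageCollect_Draft()
    {
        // "Older than 3 months" means the timestamp is BEFORE the cutoff.
        if (UpdatedAt < DateTime.UtcNow.AddMonths(-3))
            GarbageCollect();
    }

    // Anything in Archived state gets binned after 6 years.
    void GarbageCollect_Archived()
    {
        if (ArchivedAt < DateTime.UtcNow.AddYears(-6))
            GarbageCollect();
    }

    // Only marks the entity as garbage; the actual delete happens elsewhere.
    void GarbageCollect() => State = State.Garbage;
}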

Ta-da! We now have fire-and-forget entity GC; you can specify garbage collection rules for each state that the Pension Application is in. One thing I like about this is that it makes entity garbage collection a first-class citizen in the domain model, and clearly visible for each entity.
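To make the per-state convention a bit more concrete, here is a minimal sketch of how a rule named GarbageCollect_<State> could be found and run by name. Everything in it is an assumption for illustration: the Entity base class, the LeaveRequest entity with its 30-day rule, the Garbage state, and the EntitySweeper helper are all hypothetical, and the current time is passed in so the rule is easy to test.

```csharp
using System;
using System.Reflection;

public enum State { Draft, Approved, Rejected, Archived, Garbage }

// Hypothetical base class: holds the state and the "bin it" operation.
public abstract class Entity
{
    public State State { get; set; } = State.Draft;
    protected void GarbageCollect() => State = State.Garbage;
}

// Hypothetical entity with one rule: drafts untouched for 30 days get binned.
public class LeaveRequest : Entity
{
    public DateTime UpdatedAt { get; set; }

    void GarbageCollect_Draft(DateTime now)
    {
        if (UpdatedAt < now.AddDays(-30))
            GarbageCollect();
    }
}

public static class EntitySweeper
{
    const BindingFlags Flags =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    // Runs the rule named GarbageCollect_<current state>, if the entity declares one.
    // Returns true when the entity is in the Garbage state afterwards.
    public static bool Sweep(Entity entity, DateTime now)
    {
        var rule = entity.GetType().GetMethod("GarbageCollect_" + entity.State, Flags);
        rule?.Invoke(entity, new object[] { now });
        return entity.State == State.Garbage;
    }
}
```

An entity in a state with no matching rule (Approved, say) is simply left alone, which seems like the safe default.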

How and when the GarbageCollect_* methods are called is a separate problem.

Perhaps the DB might have a sproc that runs nightly to delete any entities in the "Garbage" state. You could write a background task that runs every night or week and applies the garbage collection checks to each entity. Alternatively, you might use .NET attributes to specify GC rules, and have a GC subsystem reflect on those attributes and build its own, more performance-friendly, routines in the database.
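The attribute idea might look something like the sketch below. It is untested thinking-out-loud like the rest of this post: GarbageCollectAfterAttribute and GcSqlBuilder are made-up names, the rule is expressed in months only, and the generated SQL assumes the table and column names match the class and property names exactly (a real mapping, via NHibernate for example, would come from the ORM instead).

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public enum State { Draft, Approved, Rejected, Archived }

// Hypothetical rule: rows in `State` whose `TimestampProperty`
// is older than `Months` months are garbage.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class GarbageCollectAfterAttribute : Attribute
{
    public State State { get; }
    public string TimestampProperty { get; }
    public int Months { get; }

    public GarbageCollectAfterAttribute(State state, string timestampProperty, int months)
    {
        State = state;
        TimestampProperty = timestampProperty;
        Months = months;
    }
}

[GarbageCollectAfter(State.Draft, nameof(PensionApplication.UpdatedAt), 3)]
[GarbageCollectAfter(State.Archived, nameof(PensionApplication.ArchivedAt), 72)] // 6 years
public class PensionApplication
{
    public State State { get; set; }
    public DateTime UpdatedAt { get; set; }
    public DateTime ArchivedAt { get; set; }
}

public static class GcSqlBuilder
{
    // Reflects over the rules on an entity type and returns one
    // parameterised DELETE per rule, paired with the cutoff for @cutoff.
    // Names in the SQL come from compile-time attribute values, not user input.
    public static List<(string Sql, DateTime Cutoff)> Build(Type entityType, DateTime now)
    {
        var statements = new List<(string Sql, DateTime Cutoff)>();
        foreach (var rule in entityType.GetCustomAttributes<GarbageCollectAfterAttribute>())
        {
            string sql = $"DELETE FROM {entityType.Name} " +
                         $"WHERE State = '{rule.State}' AND {rule.TimestampProperty} < @cutoff";
            statements.Add((sql, now.AddMonths(-rule.Months)));
        }
        return statements;
    }
}
```

A nightly job could then run each statement with its cutoff bound to @cutoff, so the database does the deleting in bulk rather than loading every entity to check it. Actually executing the statements is left out here.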

As with my other random musings this week, I'm just thinking out loud and haven't actually tried any of this. Hopefully I'll get to play with it in a sandbox sometime soon (probably along with some other ideas I've been toying with, and I expect I'll be using NHibernate too).

Thoughts?

Other, equally random, but related musings this week are:
