Over the past week or so we have discussed the philosophy and reasons behind the philosophy in our software development methodology. Today we are going to dig into the philosophy and give a little detail without getting too specific. We will examine the specifics in later posts.

Design Philosophy

Use the Power of the Database

The database server is the most powerful machine on the network, and it is that way for a reason. Even more compelling is the fast that Set Based Coding is orders of magnitude faster than procedural coding. Imagine having the task of picking all the rocks out of a pile that has a diameter smaller than a quarter. With procedural coding you pick up each rock, examine it and keep or toss it relative to is diameter and how it compares to your test quarter. With Set Based coding, you grab a box, drill several quarter sized holes in it, put the rocks in the box and shake. Out the bottom drop all the rocks that meet the criteria.

Entities, Attributes and Relationships

Database Entities are no more than its constituent tables. These entities have attributes that correspond to the columns in the tables and the relationships are the constraints between the entities that say ‘you can only put this type of thing here’ so we don’t end up adding alternators where tires should go. Defining these things is the biggest failure developers make. When defining entities thoroughly describe it. Even more important, thoroughly define its relationships. Start with Boyce-Codd’s 3^rd Normal Form (3NF) or their Boyce-Codd Normal Form (BCNF) and decide if you want to de-normalize or super-normalize. One of the tricks is to avoid duplication of data, make your tables tall and skinny. Put all information that has a one-to-one relationship in the same table. Don’t save calculations, usually (save a birthdate, calculate an age, but save lifetime sales amount instead of recalculating every time there is a sale). Put things that can be picked from a list in a table by themselves and relate it back: a car has a color, there are many colors. The colors table says blue is identified with 7 so the car’s Color ID attribute is 7. But you can’t show the 7 to the user. More on that later. Know the difference between an Entity and a lookup table. The Color table would be a lookup table. The Engine table would be an entity table and relate back to the Car table through a linking table. There are many engines, there are many cars but any one car only has one engine. This information is held in the linking table. More on this later as well.

Tribal Knowledge and Codes

We all know the guy who can whip out the part number for every alternator from every Chevy ever built. These people however are few and far between and they should be. That is a rare and wonderful skill that should be reserved for baseball statistics and er, alternators. For the rest of us, let’s use English. Take Social Security numbers. They first three digits tell us what state we were in when the number was issued. It’s the same with area and zip codes. This is all tribal knowledge that really is just a stand in for something else like state or region. The way to avoid this (and this is happening with phone numbers) is to avoid tribal knowledge altogether and assign that number to what it really represents and talk about that thing, not the code. So really we don’t need to know phone numbers at all, we need to know how to get in touch with John or Sally.

Use English

…or whatever language you and your users prefer, but use it. Don’t use codes, don’t shorten words, and don’t be afraid of vowels. Make your table, column, object, property and method names descriptive and representative of what they contain or do. If the data itself is text, make the column/property wide enough to contain real, meaningful information.

Primary Keys and why they should be Hidden

The main reason for hiding primary keys is discourage tribal knowledge. Frequently the code for something becomes more important than the thing it represents. Then the users will start discussing things in terms of the primary key, particularly if it houses extrinsic data, as a surrogate for the thing it represents. So we have a bin for a part in the rear of an auto parts store with the part number on the shelf below it. When that part is discontinued, that shelf will sit empty for large amounts of time, even years. See the discussion about Ross below.

Always Use Stored Procedures for ALL Database Input/Output (I/O)

Using Stored Procedures gives you several things. First is security. If you only use stored procedures you can lock everyone out of the tables and thus avoid a dependency injection attack. Second, all the database code is in one place and it doesn’t require a deployment to change the logic. Third, following the tenets in this book, as much of the work as possible should be done on the database side. This is the database side.

Security

Custom Built Users and Groups

Not relying on built in models will confuse potential hackers and will guard against unknown frailties in the security model that could be found at a later time

Identifying the User’s Session

Once a user logs in, save the session information in the session and use it to authenticate the user based on his or her session later. Make sure to check the time the user has been logged in and to log them out after an appropriate time of inactivity

Additive v. Add/Delete or Add/Update

Always add state to the database.

Less Frequent Re-Indexing

If you never update or delete records from your database, you never have to re-index it.

Access to Historical data

If you add state, you automatically have access to historical data and with a simple UserID and DateChanged field, you can tell who changed what, when and what the old values were.

Set Based v. Procedural Code

Set based code should be used as much as possible and should replace procedural code. This means that the code in your application should run on the Database Server and the code in the middle tier should be as small and light as possible. For a discussion of Set Based v. Procedural coding go here. This is a discussion of procedural coding in TSQL but it applies to ALL kinds of procedural programming.

Middle Tier

The middle tier is only a procedural or object based representation of the database. We’ve given examples of how and why set based coding is faster than procedural code. Given that, write as little procedural, middle tier code as possible, only making the calls to the database to do the hard work.

User Interface

Like the middle tier the user interface should be as small and with as little code as possible. The only logic should be validating the data typed in by the user.

Automation

Not to get ahead of ourselves, but the ultimate goal here is to define a framework, and automate the production of software as much as possible. We know this is possible because we have done it.

Tomorrow, we will start examining the individual layers in this n-tier architecture we have defined above.

Comment

Date Written	Comment