How to Corrupt Databases and Ruin Performance

16 Tuesday Sep 2014

Posted by Bradley Ross in Software Development Life Cycle

Although nobody wants to build slow and corrupt databases, it seems that many people do a very good job of slowing down and corrupting their databases. I have listed some of the methods they use below. (In case my sarcasm is not apparent, I am suggesting that you avoid the actions indicated by the section headings.)

Note: In the following discussion, I have included links for some of the terms to items in Wikipedia. This is done for convenience. The SQL for entering the sample database shown below can be found by clicking here.

Don’t Use Constraints

All of the relational database management packages allow you to apply constraints on a database. These are rules that must be satisfied before a new record is inserted or a change is made to an existing record.

Not null – A specification that a column in a table may not contain null values.
Uniqueness – A set of columns in a table whose set of values must be unique within the table. These sets of columns are called candidate keys, with one of the sets being the primary key.
Foreign key – For every given record in one table, there must be a corresponding record in another table. For example, records in an inventory list may contain part numbers. For each record in the inventory list, the part number must agree with a record in the list of part numbers.

The following is an example of SQL code that will use constraints when setting up the tables.

CREATE TABLE COUNTRY (
    COUNTRY VARCHAR(2) NOT NULL,
    NAME VARCHAR(40),
    PRIMARY KEY (COUNTRY)
);

CREATE TABLE STATE (
    STATE VARCHAR(2) NOT NULL,
    NAME VARCHAR(40),
    COUNTRY VARCHAR(2) NOT NULL,
    PRIMARY KEY (COUNTRY, STATE),
    FOREIGN KEY (COUNTRY) REFERENCES COUNTRY(COUNTRY)
);

CREATE TABLE CITY (
    CITY VARCHAR(40) NOT NULL,
    COUNTRY VARCHAR(2) NOT NULL,
    STATE VARCHAR(2) NOT NULL,
    PRIMARY KEY (COUNTRY, STATE, CITY),
    FOREIGN KEY (COUNTRY, STATE) REFERENCES STATE(COUNTRY, STATE)
);
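To see the constraints at work, here is a minimal sketch that loads the COUNTRY and STATE tables into an in-memory SQLite database through Python's sqlite3 module and then attempts three bad inserts. SQLite, the try_insert helper, and the sample rows are assumptions made for illustration; the original schema was written for MySQL. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE COUNTRY (
    COUNTRY VARCHAR(2) NOT NULL,
    NAME VARCHAR(40),
    PRIMARY KEY (COUNTRY)
);
CREATE TABLE STATE (
    STATE VARCHAR(2) NOT NULL,
    NAME VARCHAR(40),
    COUNTRY VARCHAR(2) NOT NULL,
    PRIMARY KEY (COUNTRY, STATE),
    FOREIGN KEY (COUNTRY) REFERENCES COUNTRY(COUNTRY)
);
""")

# Two valid rows (sample data for the sketch).
conn.execute("INSERT INTO COUNTRY VALUES ('US', 'United States')")
conn.execute("INSERT INTO STATE VALUES ('NY', 'New York', 'US')")


def try_insert(sql):
    """Hypothetical helper: run one INSERT and report whether it was rejected."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


results = [
    # Foreign key: there is no country 'CA' in COUNTRY.
    try_insert("INSERT INTO STATE VALUES ('ON', 'Ontario', 'CA')"),
    # Uniqueness: ('US', 'NY') is already the primary key of a row.
    try_insert("INSERT INTO STATE VALUES ('NY', 'Duplicate', 'US')"),
    # Not null: STATE may not be null.
    try_insert("INSERT INTO STATE VALUES (NULL, 'Nameless', 'US')"),
]
for line in results:
    print(line)
```

Each bad row is turned away by the database itself, no matter which program tried to insert it.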

Identifying and inserting the constraints does take time, and the following are the reasons that some people have used for not implementing them.

If the code is written correctly, none of the records in the database will violate the constraints. Therefore, they argue, time spent in applying constraints is time wasted. However, how can you ensure that the code is written correctly if you haven’t identified all of the constraints? Furthermore, it is never safe to assume that your code is error free. As the saying goes, “trust, but verify”.
Sometimes the data in the database is already corrupt. By not applying the constraints, you can keep operating until some time in the future when you clean up the data. However, as time goes on, it becomes more and more difficult to clean the data, and that time in the future never seems to come. Furthermore, processing corrupt data can have strange results.
I have seen people apply both of the first two arguments simultaneously. The code was incorrect in the past and corrupted the data, but is now correct and no further corruption will take place. My counterarguments for both of these cases still apply.

Even if your current programmers have achieved perfection, there is no assurance that programmers in the future will share this blissful state. Furthermore, if you haven’t identified all of the constraints, how can you check that your software is correct?

In addition to preventing corruption of the database, these constraints are also used by the optimizer in the database management system to find the best way of accessing the data. (There are often multiple sets of indices that can be used to locate the records, and the job of the optimizer is to find the most efficient set, which is referred to as a strategy.) Failure to implement the constraints will result in non-optimal strategies being selected. With some database managers, you can get around this by inserting “hints” in the data requests. These hints override some of the heuristic algorithms in the optimizer. However, I have a few problems with hints.

In every case that I have seen, the use of hints was required because the constraints were not fully applied.
Valid changes to the database can cause the hints to become invalid.
Depending on the constraints in the database, the database manager may ignore some of your hints. The database manager knows that the constraints are true, and will often apply more weight to the constraints than to the hints when determining a search path.

There is another reason for using foreign keys. The APIs for databases can examine the metadata and automatically generate an entity relationship diagram using the foreign keys.

Don’t Use Indexes

Indexes can speed up your transactions, but there are a few things to consider.

If the records are inserted infrequently but read very often, indexes will speed performance tremendously, and the creation of additional indexes doesn’t pose much of a problem. If the data is read infrequently, but records are frequently inserted, additional indexes can impose high demands on your system, and may not save you much. However, you will still need indexes on the primary and candidate keys to ensure that the database isn’t corrupted.
Hashed keys are only useful if you know the values of all of the columns in the index. This may seem obvious, but it is sometimes overlooked.
For sorted (ordered) keys, the order of the columns is significant. For a table that is read very often, you may want additional ordered indexes. For example, a table of medical activities might have the patient as the first item in one index and the provider as the first item in another index. In addition, appending columns to the end of an ordered index doesn’t introduce a heavy burden on the system, but may be useful.
The impact of the indexes will change as the database grows. Make sure that you test the indexes on a data set that is comparable in size to the production data.

In some cases, the database manager may add indexes based on the primary and foreign keys. In other cases, it may search the entire table if keys aren’t provided.
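The point about column order can be checked by asking the database which strategy it picks. The sketch below is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module, and the ACTIVITY table, its columns, the index names, and the plan helper are all hypothetical. The exact wording of the plan text varies between SQLite versions, so the sketch prints it rather than relying on a fixed format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ACTIVITY (
        PATIENT INTEGER NOT NULL,
        PROVIDER INTEGER NOT NULL,
        VISIT_DATE TEXT NOT NULL,
        NOTES TEXT
    )
""")
# Ordered index with the patient as the first column.
conn.execute("CREATE INDEX ACTIVITY_BY_PATIENT ON ACTIVITY (PATIENT, PROVIDER)")


def plan(sql):
    """Hypothetical helper: return the optimizer's strategy as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[-1] for row in rows)  # last column holds the description


# The leading column of the index is known, so the index can be used.
by_patient = plan("SELECT * FROM ACTIVITY WHERE PATIENT = 42")

# Only the second column is known, so the same index does not help.
by_provider_before = plan("SELECT * FROM ACTIVITY WHERE PROVIDER = 7")

# A second index with the provider first gives this query a usable path.
conn.execute("CREATE INDEX ACTIVITY_BY_PROVIDER ON ACTIVITY (PROVIDER, PATIENT)")
by_provider_after = plan("SELECT * FROM ACTIVITY WHERE PROVIDER = 7")

print(by_patient)
print(by_provider_before)
print(by_provider_after)
```

On an empty table this only shows which access path is chosen. It says nothing about timing, which still has to be measured on a data set comparable in size to production.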

Don’t Use Transactions

Transactions allow the system to ensure that SQL queries are executed properly. When you group queries within a transaction, all of the queries will be executed or all will be cancelled. However, some people still don’t use transactions. Another problem is that some people misuse transactions. The period between the start and end of a transaction should be kept short. If the database operations require user interaction or extensive preparations, prepare the data first and then carry out the data modifications in a single short transaction. This will reduce the chance of the database locking up.
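As a small sketch of the all-or-nothing behavior, the following uses Python's sqlite3 module, where a connection used as a context manager commits the enclosed statements on success and rolls them back if an exception escapes. The ACCOUNT table, the transfer function, and the balances are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ACCOUNT (
        ID INTEGER PRIMARY KEY,
        BALANCE INTEGER NOT NULL CHECK (BALANCE >= 0)
    )
""")
conn.executemany("INSERT INTO ACCOUNT VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts in one short transaction."""
    with conn:  # commit if both updates succeed, roll back if either fails
        conn.execute(
            "UPDATE ACCOUNT SET BALANCE = BALANCE + ? WHERE ID = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE ACCOUNT SET BALANCE = BALANCE - ? WHERE ID = ?", (amount, src)
        )


# The debit violates the CHECK constraint, so the earlier credit is undone too.
try:
    transfer(conn, 1, 2, 500)
except sqlite3.IntegrityError:
    pass
balances_after_failure = dict(conn.execute("SELECT ID, BALANCE FROM ACCOUNT"))

# A valid transfer applies both updates together.
transfer(conn, 1, 2, 30)
balances_after_success = dict(conn.execute("SELECT ID, BALANCE FROM ACCOUNT"))

print(balances_after_failure)
print(balances_after_success)
```

All of the preparation (choosing the accounts and the amount) happens before the with block, so the transaction itself contains only the two updates.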

Don’t Document

Documentation for a database serves a number of purposes.

A data dictionary defines the tables and columns in a database and enables all of the people working with the database to share a common understanding of the schema.
Creating documentation forces you to think about how the database is organized, and can help to avoid errors. It also allows others to review the design.
It reduces the effort required and the number of mistakes that are made when changes are made to the system.

Before you begin the design, you need to identify the most important tables for the application. These are the tables where changes will require major changes to the system, and they need to be defined as soon as possible. For these tables, you will need a list of the columns, the primary and alternate keys, and the foreign keys relating the tables.

Some parts of the documentation will be required for testing of the system and its components. If you are going to test for corruption of the database, you will need to know the constraints on the database.

These hierarchical sets will contain most of the key tables in the database. Changes to these tables will normally involve massive reworking of the code processing the database. Your data dictionary will need to include these tables first.

Ignore Errors and Crashes

Some people will say that it doesn’t matter if a system freezes so long as it works when you restart it. However, if you don’t know why the system froze, how do you know that the data hasn’t been corrupted? (Hopefully, the proper use of transactions will help avoid problems.) If the system crashes or produces error messages, you really need to know what is happening. The story of the Therac-25 is often used as a cautionary tale of what happens when you ignore error messages. (In this case, ignoring error messages resulted in a number of people dying from radiation exposure.)

When you run the database queries from your programs, be sure to check the return codes. If you are using Java, write the catch blocks for the expected errors first. Unexpected errors should generate a log entry. I have seen some people make assumptions about the error being returned. For example, they assume that a “CREATE TABLE” command either succeeds or fails because the table already exists. Assumptions can be devastating.
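The same idea can be sketched in Python with the sqlite3 module (the text above mentions Java; Python and SQLite are used here only to keep the example self-contained). The create_table function and the logger name are hypothetical, and matching on the text of the message is a simplification, since SQLite reports "already exists" and a syntax error through the same exception class.

```python
import logging
import sqlite3

log = logging.getLogger("schema")  # hypothetical logger name


def create_table(conn, ddl):
    """Return 'created' or 'exists'; log and re-raise anything unexpected."""
    try:
        conn.execute(ddl)
        return "created"
    except sqlite3.OperationalError as exc:
        # Expected case first: the table is already there.
        if "already exists" in str(exc):
            return "exists"
        # Anything else is NOT assumed to be harmless.
        log.error("unexpected error running %r: %s", ddl, exc)
        raise


conn = sqlite3.connect(":memory:")
first = create_table(conn, "CREATE TABLE T (X INTEGER)")
second = create_table(conn, "CREATE TABLE T (X INTEGER)")

# A malformed statement is neither success nor "already exists".
try:
    create_table(conn, "CREATE TABLE (")
    third = "no error"
except sqlite3.OperationalError:
    third = "unexpected error was logged and re-raised"

print(first, second, third, sep="\n")
```

Code that treats every failure of CREATE TABLE as "the table already exists" would have carried on after the third call as if nothing had happened.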

There may also be questionable areas that you want to mark down for future review. For example, if you find hard limits for some values, you may want to make a list for consideration in future updates.

Don’t Run Sanity Checks

When two tables are linked together using a one-to-many relationship, the constraint doesn’t restrict how many records in the second table can be linked to the first table. The following code is an example for testing the relationship between a table containing countries and a table listing the states/provinces within each country.

mysql> SELECT MULT, COUNT(MULT) AS FREQ
    -> FROM
    -> ( SELECT COUNT(COUNTRY) AS MULT, COUNTRY
    -> FROM STATE GROUP BY COUNTRY ) FIRST
    -> GROUP BY MULT
    -> ORDER BY MULT;

MULT	FREQ
13	1
51	1

This indicates that there is one country with 13 states (Canada) and one country (USA) with 51 states. If I had a database using 300 countries and the query said that one country had 3000 states while all of the others had under 100, I would be very suspicious of the country that had 3000 states.
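A check like this can also be scripted so that it runs with every test session. The following is a rough sketch using Python's sqlite3 module rather than the mysql client; the placeholder state codes, the PER_COUNTRY alias, and the SUSPICIOUS cutoff of 100 are assumptions chosen to mirror the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STATE (
        STATE VARCHAR(2) NOT NULL,
        COUNTRY VARCHAR(2) NOT NULL,
        PRIMARY KEY (COUNTRY, STATE)
    )
""")
# Placeholder codes "00", "01", ... stand in for real state abbreviations.
rows = [(f"{i:02d}", "CA") for i in range(13)]
rows += [(f"{i:02d}", "US") for i in range(51)]
conn.executemany("INSERT INTO STATE (STATE, COUNTRY) VALUES (?, ?)", rows)

# How many countries have 1 state, 2 states, 3 states, and so on?
distribution = conn.execute("""
    SELECT MULT, COUNT(MULT) AS FREQ
    FROM ( SELECT COUNT(COUNTRY) AS MULT, COUNTRY
           FROM STATE GROUP BY COUNTRY ) PER_COUNTRY
    GROUP BY MULT
    ORDER BY MULT
""").fetchall()

# List the countries whose state count looks suspicious.
SUSPICIOUS = 100  # hypothetical cutoff; pick one that suits your data
outliers = conn.execute("""
    SELECT COUNTRY, COUNT(*) AS MULT
    FROM STATE
    GROUP BY COUNTRY
    HAVING COUNT(*) > ?
""", (SUSPICIOUS,)).fetchall()

print(distribution)
print(outliers)
```

The second query turns the visual inspection into a list of specific records to examine, which is empty for this sample data.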

For large databases, these queries can take a long time to run but you only have to run them during testing sessions. Some of the outliers may actually be legitimate, but looking at the outliers will reduce the number of cases that have to be examined.

The goal of these sanity checks is to identify statistical anomalies in the data. Upon inspection, some may be valid, but some may point to problems in the underlying software.

And some may point to problems in other areas. There was a bank where a few employees were carrying out fraudulent transactions for cash values that were just below the trigger level for audits. If they had been running statistical distributions of the value of the transactions, the bank would have been able to detect this problem before they lost a few million dollars.
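As a rough sketch of that kind of distribution check, the following groups transaction amounts into fixed-width buckets and looks at the bucket just below the audit trigger. Everything in it is made up for illustration: the TXN table, the AUDIT_THRESHOLD and BUCKET values, and the sample amounts. It uses SQLite through Python's sqlite3 module.

```python
import sqlite3

AUDIT_THRESHOLD = 10_000  # hypothetical audit trigger level
BUCKET = 1_000            # hypothetical bucket width

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TXN (ID INTEGER PRIMARY KEY, AMOUNT INTEGER NOT NULL)")

# Invented sample amounts, with a cluster just under the threshold.
amounts = [500, 1200, 2500, 3300, 4100, 5600, 6800, 7400, 8200,
           9900, 9950, 9975, 9990, 9995, 10500, 12000]
conn.executemany("INSERT INTO TXN (AMOUNT) VALUES (?)", [(a,) for a in amounts])

# Integer division rounds each amount down to the start of its bucket.
histogram = conn.execute("""
    SELECT (AMOUNT / ?) * ? AS BUCKET_START, COUNT(*) AS FREQ
    FROM TXN
    GROUP BY BUCKET_START
    ORDER BY BUCKET_START
""", (BUCKET, BUCKET)).fetchall()

just_below = dict(histogram).get(AUDIT_THRESHOLD - BUCKET, 0)

for start, freq in histogram:
    print(f"{start:>6} {'#' * freq}")
print("transactions in the bucket just below the threshold:", just_below)
```

A spike in the bucket immediately below the threshold, compared with its neighbors, is the sort of anomaly that deserves a closer look.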

Don’t Allow Users to Make Temporary Tables and Views

In addition to the database administrator, there are other programmers that use the database. If they aren’t allowed to generate their own tables and views, it can make their job much more difficult. (These temporary files are placed in a separate schema, so that they can’t affect the actual production database.) There are workarounds to not being able to create these, but they are messy and expensive. Furthermore, if the creation of temporary tables can corrupt the production database, you have very big problems.
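How the separation works differs from one database manager to another. As one illustration, SQLite keeps temporary tables in a separate "temp" schema that is private to the connection. The sketch below uses Python's sqlite3 module; the SCRATCH table name and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COUNTRY (COUNTRY VARCHAR(2) PRIMARY KEY, NAME VARCHAR(40))")
conn.executemany(
    "INSERT INTO COUNTRY VALUES (?, ?)",
    [("US", "United States"), ("CA", "Canada")],
)

# A working copy in the temp schema; the production table is only read.
conn.execute("""
    CREATE TEMP TABLE SCRATCH AS
    SELECT COUNTRY, NAME FROM COUNTRY WHERE COUNTRY = 'CA'
""")
conn.execute("UPDATE temp.SCRATCH SET NAME = 'changed in scratch'")

# The main schema and the temp schema keep separate catalogs.
main_tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
temp_tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_temp_master WHERE type = 'table'")]
production_name = conn.execute(
    "SELECT NAME FROM COUNTRY WHERE COUNTRY = 'CA'").fetchone()[0]

print(main_tables, temp_tables, production_name)
```

The update to the scratch copy leaves the COUNTRY table untouched, and the temporary table disappears when the connection is closed.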

Worship the Optimizer

When you run an SQL query, there are often multiple ways of searching through the tables for the desired information. The optimizer selects what it assumes will be the best method for the search. Some people assume that the optimizer is so good that they don’t have to worry about the quality of their code. The optimizer is not magic, and it is essential to understand it and its limitations. (See Freefall, strip 255, http://freefall.purrsia.com)

It is preferable to give statements to the database manager rather than hints. Since it knows the statements are true, it can make many inferences. Hints are viewed as “might be true”. The statements are the constraints on the tables, the contents of the indexes, and the size of the tables.
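For the "size of the tables" part, many database managers have a command that gathers statistics for the optimizer. In SQLite it is ANALYZE, which records row counts and index statistics in the sqlite_stat1 table. The sketch below uses Python's sqlite3 module; the PART table, the index name, and the generated rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PART (PART_NO INTEGER PRIMARY KEY, SUPPLIER TEXT NOT NULL)")
conn.execute("CREATE INDEX IDX_PART_SUPPLIER ON PART (SUPPLIER)")

# 1000 generated parts spread over 10 invented suppliers.
conn.executemany(
    "INSERT INTO PART VALUES (?, ?)",
    [(i, f"S{i % 10}") for i in range(1000)],
)

# Gather statistics so the optimizer works from measured sizes, not guesses.
conn.execute("ANALYZE")

stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
for tbl, idx, stat in stats:
    print(tbl, idx, stat)
```

The statistics describe the data at the time ANALYZE ran, so they need to be refreshed as the tables grow.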

Don’t Update Software

If you don’t apply updates to the drivers and database managers, very strange things can happen. One major benefit of updating is that security holes are fixed, and this often includes security holes that the vendor never told anybody about.

Of course, you have to test the software when you carry out updates. But you have to test it anyway, and the testing may reveal problems unrelated to the updates. And if you do find errors and haven’t updated the software, there is a chance that the problems are fixed by the update and aren’t actually due to your code.

Why My Code Works

28 Thursday Aug 2014

Posted by Bradley Ross in Attitudes, Software Development Life Cycle

Tags

development guidelines

My code works, and I’m not satisfied until the user is satisfied. I’m not saying that I’m a miracle worker, but I do have a number of principles that have enabled me to succeed where others have given up.

Understand the user

I started in engineering and have worked with management, software development, research, operations, sales, and a variety of other groups. It may take a little work, but you can eventually get them to tell you what they want. It amounts to getting their work done faster and with less effort, making customers and management happy, and avoiding surprises. There are differences, but there are far more similarities. Also remember that the developer will have an easier time understanding the user than the user will have understanding the software.

When the user describes what he thinks is a bug, listen. When working in the chemical industry, I often found that the operators were correct about what was happening, even if they were wrong about why it was happening. “He yelled at me” is not a good reason for not listening. I usually take it as a sign that he cares about whether things work, and he will yell a lot less if I listen to him.

Know what’s easy and what’s difficult

One of the biggest problems that I have seen when communicating with users is that they don’t understand what’s easy and what’s hard in the development of software. There are often things that are easy to do in the software but very difficult for the user. There was one application where the mail room received multiple copies of a few dozen reports and they had to create packets for each of the recipients. I changed the software to collate the reports by recipient and the mail room was ecstatic. It only took me a few days but it saved them weeks of effort. They didn’t realize that the reports could come off the printer sorted by recipient.

On the other hand, some things are very difficult to do in software, such as voice recognition software and character recognition for handwritten notes. If you explain the problems, you may find that the requirement is not as important as originally reported. Furthermore, there may be an alternative that’s easy for software to carry out and still meets their needs. For example, it would be difficult to extract the “From” field on hand written notes sent via facsimile. On the other hand, the computer can sort them by the telephone number of the fax machine that sent the note.

Use your experience

I’ve been working in software for many years and have seen many problems. There are very few that are entirely new. A fellow programmer was having problems with a program that was going into an infinite loop and asked for my help. I took a quick look at the program, pointed to a line and asked him if that was what he really meant. He looked at it, and asked how I could have read the program so fast. I told him that I didn’t know what the program did, but the line was a complicated mass of nested arithmetic and logical operations, and that was where errors resulting in infinite loops were usually found. I had a few other cases where the other person was amazed at what appeared to be magic. In almost all of the cases, my explanation was the same: that was what the problem was the last few times I encountered it.

If your experience can’t solve it, ask others. Don’t be afraid to let people know that you need help. In one case, I asked for help with an algorithm for processing medical data. When I explained how my code worked, he said that I was using the wrong reference value for one of the measurements. When I followed his suggestion, I obtained the correct answer. It turned out that there was an error in the reference documentation for the equipment.

In another case, a magazine article mentioned how there was a requirement that an equipment room be accessible in an emergency, but entrance had to be prevented at other times. (The employees were using it as an unofficial smoking lounge.) The article went on to state how it was impossible to satisfy both requirements, a statement that I found hilarious. You simply put an alarm on the door. People know that opening the door will set off the alarm, and they won’t use it unless it is an emergency.

Not a perfectionist

I am not a perfectionist. I merely want a close enough approximation such that nobody else can tell the difference. It’s actually not as difficult as it sounds. Management will sometimes tell you to only handle the serious problems, but there is no way to tell with certainty which problems are the serious ones. There was a program that could be compiled for two different computer architectures, but it was only working for one of them. I found an addressing error in the code that caused it to crash on one system but didn’t cause any noticeable problems on the other. When I fixed the bug, it worked fine on both systems. The problem is that I don’t know if the results were correct on the system where it was working. Ignoring error messages has been known to cause deaths in some situations. (See Therac-25.) If a web application crashes once a week and has to be restarted, do you really know what happens when it crashes?

When I code web pages, I turn on the development tools that list all of the errors in the code, even if the page is able to continue. You would be amazed at how many errors appear, most of them being typographical errors in JavaScript code and tag attributes, which are easy to fix. However, the results of these errors are unpredictable and undefined. A change in the version of a browser can suddenly cause the web pages to fail completely.

It may be that time constraints prevent you from fixing some of these problems. However, they should be noted and fixed when you have the time.

Religious fanaticism

When you design a database, there are some rules called Codd’s Rules of Normalization. Sometimes you have to break the rules in order to improve performance. I remember a heated debate with a database “expert” who argued that you always follow the rules even if it renders the application non-functional. This is what I call religious fanaticism in software engineering. (By the way, data warehouses are a violation of the normalization rules.) I am not a religious fanatic. In this case, I believe that the reason for the violation of the rules should be stated in the documentation, together with possible problems and how they should be handled.

Other fanatics will specify the use of frameworks or software libraries because they feel that that is the current technology for all applications. Using a tool or method because other software components work that way may actually be justified because it limits the number of packages for which the maintainers have to maintain proficiency. However, using packages such as Spring, Struts, JavaServer Faces, Facelets, or the current fad of the day because it is the current fad of the day is rarely a good idea.

Fanatics will also tend to mention things like agile programming, extreme programming, Six Sigma, and the SEI’s Capability Maturity Model as if the proper incantation of the mystic rituals will achieve success. These are frameworks for thinking about processes, and, in my opinion, the key word is “think”.

Look for tools, not solutions

I will often use Google or go through my reference books to find a solution to a problem. However, I don’t really expect to find the solution. What I expect to find are the tools that will enable me to build a solution, and I keep looking until I find them. That is how engineering really works. The reason that most people give up is that they can’t find a solution and using the tools to build a solution is hard work. When you look at it, the engineering process works for both bridges and software and can be thought of as having the following steps.

Observation – Collect relevant observations about the problem
Decomposition – Break the problem into smaller pieces
Analysis – Examine the pieces and how existing tools can deal with them
Synthesis – Construct a design for the solution
Implementation – Convert the design into a solution

Actual Requirements

26 Tuesday Aug 2014

Posted by Bradley Ross in Software Development Life Cycle

Tags

development guidelines

It has been said that a journey of a thousand miles begins with a single step. However, unless you have a compass and a map, those first few steps are unlikely to get you nearer your destination. Then again, following the map without question may also fail to get you to your destination. There have been a number of cases where truck drivers have blocked traffic by ignoring clearly marked height and weight restrictions. The police have been generally unimpressed by the statement that the driver had to keep going because that was what Google Maps or Mapquest told him to do.

When creating a software application, the requirements document can serve as a map showing your destination and how far you have to go. However, there can be problems.

I have encountered people who ask what we have to do so that we can declare the requirement has been satisfied. My attitude is that you look for what has to be done to satisfy the requirement. (Some people really don’t see a difference.) They assume that they can convince the user that the requirement has been met, even though it hasn’t. If the statement comes from your manager, you may not have much of a choice, but don’t be surprised when the user isn’t happy. If you can’t meet a requirement, you have to deal with it. Determine the impact of not meeting the requirement and see what you can do to either meet the requirement or reduce the impact of not meeting the requirement. At the very least, don’t let it catch the user by surprise. (Surprises are very unpopular.) The user may decide that the requirement can be eliminated or replaced with something else.
In addition to the requirements in the document, there are many implied requirements. These are the requirements that seem so obvious, they don’t need to be said. However, that being said, it is probably best that they are said, and you should consider adding them to the requirements list. An example might be having electrical power and an internet connection where the computers are to be located. Even if it isn’t your requirement, it is somebody’s requirement. (When I informed one user that I couldn’t install a computer system because there was no internet connection in their building, they informed me that I should stop by the stock room on the way over and pick one up. This is the equivalent to installing an electrical outlet without having it hooked up to power.)

When I look at an application, I think about the abilities of the software. Whether or not they are in the formal requirements document, you want to think about these things. I have listed a few of the items below, and you may find these items useful.

User oriented

I’m not a perfectionist when it comes to user demands. I merely desire a close enough approximation so that the user can’t tell the difference. It’s actually easier than you think.

Capability – There is something that the software is supposed to do. Does the software have the capability to do it?
Usability – Is the user able to use the software to do his work? Does the software have built-in documentation, or does he need to have a manual ready when he uses it? (If the user needs to have a few manuals sitting next to him, that says something about the lack of usability.)
Flexibility – Does the user have to change how he does work in order to use the software? Can different users operate the system in ways that are convenient for them?
Likability – Does the user feel comfortable using the software? After a few hours, do you hear screams of despair from his cubicle accompanied by the sounds of breaking pottery? (I am speaking quite literally here.)
Verifiability – After carrying out a task, is the user able to verify that he did it correctly? This will go a long way towards user satisfaction.
Reliability – Will the system be available when the user wants it?

Maintenance oriented

Things will go wrong from time to time. If the system can’t be fixed, expect problems.

Understandability – Can the people who maintain the system understand how it works? Is it well documented, including the source code and the instructions for the software builds?
Traceability – If something goes wrong, can the people maintaining the system identify the piece of code causing the problem?
Audit-ability – Can the people who maintain the system verify that it is actually doing what it is supposed to be doing? If it isn’t doing the right thing, can they determine what it is actually doing? If there are operational problems, are the operators informed or must they wait for the sound of angry users?
Restorability – If the system crashes, can it be restarted? Can it be verified that the restart doesn’t introduce errors?
Survivability – If unexpected events occur, will the system keep running? What happens when there are network problems or an unexpected load on the system?
Administrate-ability – Can the operator make administrative changes to the system or do they require changes to the code?
Isolatability – Can changes be tested on a test platform before being placed in production? (Isolating the development, test, and production environments.)

Change is a constant

I have frequently heard that there will be no further changes to a system. However, there are always more changes.

Modifiability – How hard is it to make changes to the system? For example, how hard is it to change the database used by the application? What is involved in making changes to the wording of menus? Can the system be moved to another platform?
Scalability – If the workload increases by an order of magnitude, what is required to keep it running?
Mobility – Can the software system be moved to another system, and how much effort is involved? I was involved in one set of applications that hard-coded the IP addresses in the software. When the network was reconfigured, the IP addresses had to be changed, and they had to change all of the code for the system.
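One common way to avoid the hard-coded address problem is to read the address from the environment or a configuration file. The following is only a sketch: the function name database_address, the variable names APP_DB_HOST and APP_DB_PORT, and the defaults are hypothetical.

```python
import os


def database_address(env=os.environ):
    """Look up the database host and port instead of hard-coding them."""
    host = env.get("APP_DB_HOST", "localhost")   # hypothetical variable name
    port = int(env.get("APP_DB_PORT", "3306"))   # hypothetical variable name
    return host, port


# After a network change, only the configuration is edited, not the code.
configured = database_address(
    {"APP_DB_HOST": "db.example.internal", "APP_DB_PORT": "3307"}
)
defaults = database_address({})

print(configured)
print(defaults)
```

Passing the mapping in as a parameter also makes the lookup easy to exercise on a test platform without touching the real environment.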

The User Has a Problem

16 Sunday Mar 2014

Posted by Bradley Ross in Software Development Life Cycle

Tags

development guidelines

When trying to resolve user complaints, one of the biggest problems is to ensure that the necessary information is communicated between the user and the help desk. The purpose of this entry is to provide a check list for the help desk in order to enhance communications.

The first thing is to describe the problem. Start by getting the answer to the following questions.

What was the user trying to do? The answer to this question may affect your understanding of the answers to the other questions.
What is the application with the problem? This may seem obvious, but users often don’t know what application they are using, especially if they start it by clicking on an icon. If there is a question, the following are some methods that can be used for identification.
- If the information was received on paper, have them send copies of the first page and the page with the questionable data.
- If the information was received in an e-mail, have them forward the e-mail to you.
- If the information was received using a web application with a browser, have them send the URL used to start the application, a screen shot of the starting page of the application, and a screen shot of the page with the questionable information.
- If it is a GUI application with a menu bar, the About menu item should provide information about the application.
- If the application is launched using a batch file or script, the contents of the file may provide information about the application.
- Screen shots of the opening page of the application and the page with the questionable information. The name of the program or the text for the icon would also be useful.
What did the user do? Try to get as much information about the steps the user took. Tools for remote access to the screen can be very helpful.
What happened?
What did the user expect to happen?

Now that you have the initial information, there are a few immediate steps that can be taken.

You may be able to resolve the request immediately. Even if it isn’t really your area, answer it and you can close the ticket. For example, they may simply need to know how to sort a column in Microsoft Excel. The user will be happy and you will be happy.
Have the user check his connection to the internet by going to some web sites both inside and outside the firewall. Determine if the user can send and receive e-mail, and whether he can access shared drives. Problems in these areas should be resolved before going further.
Determine if the problem should properly go to somebody else. If so, notify the user where you are forwarding the request and that he can contact you again if nobody contacts him. If the user has to take action to get the problem to the right group, let him know what to do.
It may be that the user is actually getting the correct information, but doesn’t realize it. In an insurance application, the report could be aggregated by state of residence for the patient or the state of the processing office handling the claim. Some users didn’t fully understand the difference. Explain the situation to the user and ask him if that satisfies his request.
If the user is running the application through a web browser, have him open the error console and repeat the process. This can sometimes give you useful information.

If you haven’t resolved the problem at this point, you are going to have to do some research. The following information from the user will aid in this process.

Is the problem repeatable? If the user carries out the same steps, will the same problem occur? Did the program ever work correctly in the past? If so, when? If he uses a different workstation, does the same problem occur?
Have there been any recent changes? For example, is he using a new workstation or has there recently been maintenance or upgrades on the workstation? Is he operating the application in a new location or using a different network?
Get some information on the system being used. This would include operating system (Linux, Windows, Mac OS X, Sun UNIX, etc.), the version of the operating system, the type and version of the web browser (Google Chrome, Microsoft Internet Explorer, Firefox, Opera, etc.), and anything else that seems appropriate.

And most importantly, tell the user that you will contact him when you learn anything.

Bradley Ross' Blog on Life and Computer Software

~ What I've learned in IT

Tag Archives: development guidelines
