May CPI Data Out

The consumer price index (CPI) data for May was released yesterday, showing a two point spike in both the national and southeast averages. If we calculate the inflation rate for this year. All is not well.

In 2007, the entire inflation was 4.44% for the southeast and 4.08% for the nation as a whole:

Southeast = [ ( 203.457 – 194.8 ) / 194.8 ] * 100 = 4.44%
National = [ ( 210.036 – 201.8 ) / 201.8 ] * 100 = 4.08%

In just the first five months of 2008 we are rapidly approaching the same figures. The southeast inflation for January 2008 through May 2008 is 3.22%. The national inflation for the same period is 3.14%.

Southeast = [ ( 210.006 – 203.457 ) / 203.457 ] * 100 = 3.22%
National = [ ( 216.632 – 210.036 ) / 210.036 ] * 100 = 3.14%

If we look at these numbers we can estimate the total inflation for the year, ceteris paribus. By calculating the change from month to month for each region we are interested in we can arrive at an average change per month. If we then apply this to the rest of the months in the year we can figure inflation for the total period, giving us our estimate.

Here are the numbers:

National CPI Southeast CPI National ? Southeast ?
12/2007 210.036 203.457
01/2008 211.080 204.510 1.044 1.053
02/2008 211.693 205.060 0.613 0.550
03/2008 213.528 206.676 1.835 1.616
04/2008 214.823 208.085 1.295 1.409
05/2008 216.632 210.006 1.809 1.921
Average 1.319 1.310

So with a average change of 1.310 for the southeast and 1.319 for the nation, we can calculate the estimated inflation for the year by adding the current CPI for each region to the region’s average, times the number of months remaining in the year (7):

Est. Southeast CPI = 210.006 + ( 7 * 1.310 ) = 219.176
Est. National CPI = 216.632 + ( 7 * 1.319 ) = 225.865

We can now recalculate the inflation, using these latest numbers, to arrive at our estimated inflation for the year:

Southeast = [ ( 219.176 – 203.457 ) / 203.457 ] * 100 = 7.73%
National = [ ( 225.865 – 210.036 ) / 210.036 ] * 100 = 7.54%

So using our estimation, based on the current progress of the year, we are looking at total inflation for the year capping out around 7.73% in the southeast and 7.54% nationally.

Pretty scary, huh?

Threading Problems In Web Applications

“I don’t have much threading experience, I work on web applications.”

Ever heard this? Me too. In fact, I’ve heard it come from my mouth a time or two. My exposure to threading is limited as I have traditionally operated within the web applications realm, however it is important that even web application engineers understand how threads work and the importance of writing thread safe code.

The example
Let’s say we have a contrived web application that stores a set of data in the session. For our purposes we will assume that the sample application is storing email addresses in the session (which it is not – they are just dummy strings). The users of the application can perform many different functions on the stored data but two are prevalent: removing addresses from the collection and sending mail to each address.

When the user selects to remove an item from the collection the storage object is retrieved from the session and the item is found, then removed. When the user selects to send mail to each item in the collection the storage object is retrieved from the session and iterated over. As the application iterates through the collection it hits the database to retrieve the appropriate message and to update the user record with the number of emails they have been sent (the sample application just does a Thread.sleep(300) to simulate this).

The problem
Because of database lookups, socket creation, and more, sending the emails to the entire list can take a while – roughly 300 milliseconds per item – so many users right click and open this in a new tab while they continue the rest of their work. To simulate this the sample application has a link on the main page that pops open two windows, simulating the user removing an address from the list and sending emails to all in the list at the same time.

When the non thread safe version of the code runs the remove functionality deletes an item from the collection while the send functionality is still iterating through the items. This will, in most cases, cause an IndexOutOfBoundsException to be thrown. To fix this we simply must change the send code so that it either makes a defensive copy first or does something else to accommodate the possibility of the underlying data changing. In our case we will implement the iterations with an Iterator.

The code
Comments, imports, and more in all code shown in this post has been removed for brevity – download the full example for the full thing.

The example is comprised of two main objects: GoodObject and BadObject. GoodObject takes into account the fact that the underlying data set may change while BadObject does not. Both extend BaseObject in order to share some common functionality:

public abstract class BaseObject {

    protected List data = new CopyOnWriteArrayList();

    public void add(String toAdd) {
        data.add(toAdd);
    }

    public boolean remove(String toRemove) {
        boolean toReturn = data.remove(toRemove); // For logging...
        return toReturn;
    }

    public abstract void doSomething();

    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0;i < data.size();i++) {
            sb.append(data.get(i) + "  ");
        }
        return sb.toString();
    }

    protected void simulateSomething(Object obj) {
        try {
            Thread.sleep(300);
        } catch (Exception e) { /* Don't Care... */ }
    }

}

In the BaseObject the add(), remove, and simulateSomething() methods are all shared and the doSomething() method will perform an action in a good and bad way. Following our example above, the doSomething() method would do some database lookups and send an email in real life.

The BadObject implementation of the doSomething() method fails to recognize the fact that the underlying collection can be updated in a non-synchronized way:

     public void doSomething() {
        int len = data.size();
        simulateSomething(null);

        for (int i = 0;i < len;i++) {
            simulateSomething(data.get(i));
        }
    }

The failure point here comes if the remove() call is made after the call to data.size() is. Since the size is gathered then the item is removed, the for loop will inevitably try to access an elements outside the collection bounds, resulting in an exception. The same case would ring true if the collection needed to be iterated backwards for some reason.

The GoodObject implementation uses Java’s Iterator class to handle this problem:

     public void doSomething() {
        int len = data.size();
        simulateSomething(null);

        Iterator<String> items = data.iterator();
        while (items.hasNext()) {
            simulateSomething(items.next());
        }
    }

Since the iteration of the collection is no longer tied to an index, we won’t be able to hit a non-existent element and have corrected the problem.

The lesson
While this is a contrived example, it is not too far fetched. I have seen code blow up countless times because it was unable to handle something as simple as multiple windows being open. Other times I have seen weird behavior, such as each window mirroring the other because the application navigation was session bound. Likewise, I have seen code in many different organizations that did not take these things into account.

If you are writing a web application think to yourself: what is going to happen if the user opens something in a new window, thus creating two access points to the same session?

Design Patterns Refcard Released Today

You can now get a professional version of my design patterns quick reference from DZone. This newly published version has the same diagrams as my version but it is a bit more robust in that it offers a description, a “use when” section, and an implementation example for each pattern. I tried to take all the examples from real world scenarios and applications so that they would be as easy to relate to as possible.

This new version is a 7 page spread as opposed to my slimmed down 2 page version. Both are still available and, in my opinion, very useful.

There is no feasible way to incorporate code samples into anything and still consider it a “quick reference”, so you’ll need to understand at least a little bit of UML for this to be worthwhile. You can check out a quick reference for UML here and here.

This is a free download – you just have to have a DZone account.

You can also get design patterns gear (shirts, posters, etc.) here.