March 2008 - Posts

What's Wrong With This Code (#19)

Leroy was shocked when the source code appeared. It was familiar yet strange, like an old lover's kiss. The code was five years old – an artifact of Leroy's first project. Leroy slowly scrolled through the code and pondered his next move. It wasn't a bug that was bothering Leroy – there were no race conditions or tricky numerical conversions. No performance problems or uncaught error conditions. It was all about design …

public class BankAccount
{
    public void Deposit(decimal amount)
    {
        _balance += amount;
        LogTransaction("Deposited {0} on {1}", amount, DateTime.Now);
    }

    public void Withdraw(decimal amount)
    {
        _balance -= amount;
        LogTransaction("Withdrew {0} on {1}", amount, DateTime.Now);
    }

    public void AccumulateInterest(decimal baseRate)
    {
        decimal interest;

        if (_balance < 10000)
        {
            interest = _balance * baseRate;
        }
        else
        {
            // the m suffix keeps the literal a decimal; a bare 0.01 is a double and will not compile here
            interest = _balance * (baseRate + 0.01m);
        }
        LogTransaction("Accumulated {0} interest on {1}", interest, DateTime.Now);
    }

    void LogTransaction(string message, params object[] parameters)
    {
        using(FileStream fs = File.Open("auditlog.txt", FileMode.OpenOrCreate))
        using(StreamWriter writer = new StreamWriter(fs))
        {
            writer.WriteLine(message, parameters);
        }
    }

    public decimal Balance
    {
        get { return _balance; }
        set { _balance = value; }
    }

    decimal _balance;
}

"Times have changed, and so have I, fortunately", Leroy thought to himself. "And so will this code…"

To be continued…

posted by scott with 33 Comments

Custom Aggregations In LINQ

Aggregate is a standard LINQ operator for in-memory collections that allows us to build a custom aggregation. Although LINQ provides a few standard aggregation operators, like Count, Min, Max, and Average, if you want an inline implementation of, say, a standard deviation calculation, then the Aggregate extension method is one approach you can use (the other approach being that you could write your own operator).

Let's say we wanted to see the total number of threads running on a machine. We could get that number lambda style, or with a query comprehension, or with a custom aggregate.

var processes = Process.GetProcesses();

int totalThreads = 0;

totalThreads = processes.Sum(p => p.Threads.Count);

totalThreads = (
    from process in processes
    select process.Threads.Count).Sum();

totalThreads =
    processes.Aggregate(
        0,                                  // initialize
        (acc, p) => acc + p.Threads.Count,  // accumulate
        acc => acc                          // terminate
    );

This particular overloaded version of Aggregate follows a common pattern of "Initialize – Accumulate – Terminate". You can see this pattern in extensible aggregation strategies from Oracle to SQLCLR. The first parameter represents an initialization expression. We need to provide an initialized accumulator – in this case just an integer value of 0.

The second parameter is a Func<int, Process, int> expression that the aggregate method will invoke as it iterates across the sequence of inputs. For each process we get our accumulator value (an int), and a reference to the current process in the iteration stage (a Process), and we return a new accumulator value (an int).

The last parameter is the terminate expression. This is an opportunity to provide any final calculations. For our summation, we just need to return the value in the accumulator.
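The summation doesn't really exercise the terminate step, so here is a minimal sketch where all three steps do some work: computing a mean. The class name AggregateSketch and the method Mean are made up for illustration; only the Aggregate overload itself comes from LINQ.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of LINQ: shows Initialize - Accumulate - Terminate.
public static class AggregateSketch
{
    public static double Mean(int[] values)
    {
        return values.Aggregate(
            // initialize: an accumulator carrying a running sum and a count
            new { Sum = 0, Count = 0 },
            // accumulate: return a new accumulator for each element
            (acc, v) => new { Sum = acc.Sum + v, Count = acc.Count + 1 },
            // terminate: turn the accumulator into the final result
            acc => acc.Count == 0 ? 0.0 : (double)acc.Sum / acc.Count);
    }
}
```

Calling AggregateSketch.Mean(new[] { 2, 4, 6 }) walks the array once and divides only at the end, in the terminate expression.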

StdDev

Now, let's compute a more thorough summary of running threads, including a standard deviation. Although we could get away with a simple double accumulator for stddev, we can also use a more sophisticated accumulator to encapsulate some calculations, facilitate unit tests, and make the syntax easier on the eye.

class StdDevAccumulator<TSource>
{
    public StdDevAccumulator(IEnumerable<TSource> source,
                             Func<TSource, double> avgSelector)
    {
        SampleAvg = source.Average(avgSelector);
        SampleCount = source.Count();
    }

    public StdDevAccumulator<TSource> Accumulate(double value)
    {
        TotalDeviation += Math.Pow(value - SampleAvg, 2.0);
        return this;
    }

    public double ComputeResult()
    {
        if (SampleCount < 2)
        {
            return 0.0;
        }
        return Math.Sqrt(TotalDeviation / (SampleCount - 1));
    }

    public double SampleAvg { get; set; }
    public int    SampleCount { get; set; }
    public double TotalDeviation { get; set; }
}

Put the accumulator to use like so:

var processes = Process.GetProcesses();

var summary = new
    {
        TotalProcesses = processes.Count(),
        TotalThreads = processes.Sum(p => p.Threads.Count),
        MinThreads = processes.Min(p => p.Threads.Count),
        MaxThreads = processes.Max(p => p.Threads.Count),
        StdDevThreads = processes.Aggregate(
                new StdDevAccumulator<Process>(processes, p => p.Threads.Count),
                (acc, p) => acc.Accumulate(p.Threads.Count),
                (acc)    => acc.ComputeResult()
        )
    };
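The other approach mentioned at the top of this post, writing your own operator, might look something like the sketch below. The names StdDevExtensions and StdDev are my own invention, and the method computes the same sample standard deviation (dividing by n - 1) as the accumulator above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical custom operator; LINQ itself has no StdDev method.
public static class StdDevExtensions
{
    public static double StdDev<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, double> selector)
    {
        // Materialize once so the source is not enumerated repeatedly.
        List<double> values = source.Select(selector).ToList();
        if (values.Count < 2)
        {
            return 0.0;
        }

        double mean = values.Average();
        double totalDeviation = values.Sum(v => (v - mean) * (v - mean));
        return Math.Sqrt(totalDeviation / (values.Count - 1));
    }
}
```

With this in scope the summary could say processes.StdDev(p => p.Threads.Count). The trade-off is an extra List allocation in exchange for a call site that reads like the built-in operators.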

posted by scott with 6 Comments

And Equality for All ... Anonymous Types

Given this simple Employee class:

public class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
}

How many employees do you expect to see from the following query with a Distinct operator?

var employees = new List<Employee>
{
    new Employee { ID=1, Name="Barack" },
    new Employee { ID=2, Name="Hillary" },
    new Employee { ID=2, Name="Hillary" },
    new Employee { ID=3, Name="Mac" }
};

var query =
    (from employee in employees
     select employee).Distinct();

foreach (var employee in query)
{
    Console.WriteLine(employee.Name);
}

The answer is 4 – we'll see both Hillary objects. The docs for Distinct are clear – the method uses the default equality comparer to test for equality, and the default comparer sees 4 distinct object references. One way to get around this would be to use the overloaded version of Distinct that accepts a custom IEqualityComparer.
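A minimal sketch of the custom comparer route follows. EmployeeComparer and DistinctSketch are hypothetical names, and treating two employees as equal when both ID and Name match is an assumption on my part (comparing ID alone might be the better rule in a real system).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer: equal when ID and Name both match.
public class EmployeeComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ID == y.ID && x.Name == y.Name;
    }

    public int GetHashCode(Employee employee)
    {
        if (employee == null) return 0;
        int nameHash = employee.Name == null ? 0 : employee.Name.GetHashCode();
        // Equal objects must produce equal hash codes, so hash the same fields Equals compares.
        return employee.ID.GetHashCode() ^ nameHash;
    }
}

public static class DistinctSketch
{
    public static List<string> DistinctNames(IEnumerable<Employee> employees)
    {
        return employees.Distinct(new EmployeeComparer())
                        .Select(e => e.Name)
                        .ToList();
    }
}
```

Passing the four-employee list from above to DistinctSketch.DistinctNames leaves a single Hillary in the result.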

Let's try the query again and project a new, anonymous type with the same properties as Employee.

var query =
    (from employee in employees
     select new { employee.ID, employee.Name }).Distinct();

That query only yields three objects – Distinct removes the duplicate Hillary! How'd it suddenly get so smart?

Turns out the C# compiler overrides Equals and GetHashCode for anonymous types. The implementation of the two overridden methods uses all the public properties on the type to compute an object's hash code and test for equality. If two objects of the same anonymous type have all the same values for their properties – the objects are equal. This is a safe strategy since anonymously typed objects are essentially immutable (all the properties are read-only). Fiddling with the hash code of a mutable type gets a bit dicey.
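To see the compiler-generated behavior in isolation, here is a small sketch. AnonymousEqualitySketch and its method names are hypothetical; the only things being exercised are Equals, GetHashCode, and reference identity on two anonymous objects.

```csharp
using System;

// Hypothetical demo class for the equality rules of anonymous types.
public static class AnonymousEqualitySketch
{
    // Same property names, types, order, and values: Equals and GetHashCode agree.
    public static bool SameValuesAreEqual()
    {
        var a = new { ID = 2, Name = "Hillary" };
        var b = new { ID = 2, Name = "Hillary" };
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    // The two objects are still separate instances on the heap.
    public static bool SameValuesAreSameReference()
    {
        var a = new { ID = 2, Name = "Hillary" };
        var b = new { ID = 2, Name = "Hillary" };
        return object.ReferenceEquals(a, b);
    }

    // One differing property value is enough to break equality.
    public static bool DifferentValuesAreEqual()
    {
        var a = new { ID = 2, Name = "Hillary" };
        var b = new { ID = 3, Name = "Hillary" };
        return a.Equals(b);
    }
}
```

Distinct relies on exactly these two methods, which is why the projection collapses the duplicate while the Employee version does not.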

Interestingly – I stumbled on the Visual Basic version of anonymous types as I was writing this post and I see that VB allows you to define "Key" properties. In VB, only the values of Key properties are compared during an equality test. Key properties are readonly, while non-key properties on an anonymous type are mutable. That's a very C sharpish thing to do, VB team.

posted by scott with 1 Comments

Inner, Outer, Let's All Join Together With LINQ

The least intuitive LINQ operators for me are the join operators. After working with healthcare data warehouses for years, I've become accustomed to writing outer joins to circumvent data of the most … suboptimal kind. Foreign keys? What are those? Alas, I digress…

At first glance, LINQ appears to only offer a join operator with an 'inner join' behavior. That is, when joining a sequence of departments with a sequence of employees, we will only see those departments that have one or more employees.

var query =
  from department in departments
  join employee in employees
      on department.ID equals employee.DepartmentID
  select new { employee.Name, Department = department.Name };

After a bit more digging, you might come across the GroupJoin operator. We can use GroupJoin like a SQL left outer join. The "left" side of the join is the outer sequence. If we use departments as the outer sequence in a group join, we can then see the departments with no employees. Note: it is the into keyword in the next query that triggers the C# compiler to use a GroupJoin instead of a plain Join operator.

var query =
  from department in departments
  join employee in employees
      on department.ID equals employee.DepartmentID
      into employeeGroup
  select new { department.Name, Employees = employeeGroup };

As you might suspect from the syntax, however, the query doesn't give us back a "flat" resultset like a SQL query. Instead, we have a hierarchy to traverse. The projection provides us a department name for each sequence of employees.

foreach (var department in query)
{
    Console.WriteLine("{0}", department.Name);
    foreach (var employee in department.Employees)
    {
        Console.WriteLine("\t{0}", employee.Name);
    }
}
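For comparison, the same group join can be written by calling the GroupJoin operator directly. This is only a sketch: the Department and Employee shapes are assumed from the queries in this post, and GroupJoinSketch.HeadCounts is a made-up helper that counts the employees in each group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shapes, inferred from the queries in this post.
public class Department
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class Employee
{
    public string Name { get; set; }
    public int DepartmentID { get; set; }
}

public static class GroupJoinSketch
{
    // Hypothetical helper: department name -> number of employees.
    public static Dictionary<string, int> HeadCounts(
        IEnumerable<Department> departments,
        IEnumerable<Employee> employees)
    {
        return departments
            .GroupJoin(
                employees,
                d => d.ID,               // outer key
                e => e.DepartmentID,     // inner key
                (d, staff) => new { d.Name, Employees = staff })
            // Departments with no employees still appear, with an empty group.
            .ToDictionary(x => x.Name, x => x.Employees.Count());
    }
}
```

A department with no matching employees shows up with a count of zero, which is the "left outer" behavior described above.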

Flattening a sequence is a job for SelectMany. The trick is in knowing that adding an additional from clause translates to a SelectMany operator, and just like the outer joins of SQL, we need to project a null value when no employee exists for a given department – this is the job of DefaultIfEmpty.

var query =
  from department in departments
  join employee in employees
      on department.ID equals employee.DepartmentID
      into employeeGroups
  from employee in employeeGroups.DefaultIfEmpty()
  select new { DepartmentName = department.Name, EmployeeName = employee.Name };

One last catch – this query does work with LINQ to SQL, but if you are stubbing out a layer using in-memory collections, the query can easily throw a null reference exception. The last tweak would be to make sure you have a non-null employee object before asking for the Name property in the last select.
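Here is one way that tweak might look against in-memory collections. The Department and Employee shapes are assumed from the queries above, LeftJoinSketch is a hypothetical name, and the "(none)" placeholder is an arbitrary choice standing in for SQL's NULL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shapes, inferred from the queries in this post.
public class Department
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class Employee
{
    public string Name { get; set; }
    public int DepartmentID { get; set; }
}

public static class LeftJoinSketch
{
    public static List<string> Rows(
        IEnumerable<Department> departments,
        IEnumerable<Employee> employees)
    {
        var query =
            from department in departments
            join employee in employees
                on department.ID equals employee.DepartmentID
                into employeeGroup
            // DefaultIfEmpty yields a single null for departments with no employees.
            from match in employeeGroup.DefaultIfEmpty()
            select new
            {
                DepartmentName = department.Name,
                // Guard against the null before touching Name.
                EmployeeName = match == null ? "(none)" : match.Name
            };

        return query.Select(r => r.DepartmentName + ": " + r.EmployeeName).ToList();
    }
}
```

I named the second range variable match only to keep the two roles visually distinct.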

posted by scott with 9 Comments

Mashups with SyndicationFeed and LINQ

I was experimenting with the new SyndicationFeed class in 3.5 earlier this year and devised a mashup LINQ query:

string[] feedUrls = { "http://www.OdeToCode.com/blogs/scott/rss.aspx",
                      "http://www.pluralsight.com/blogs/mainfeed.aspx",
                      "http://feeds.feedburner.com/ScottHanselman"
                    };

var items =
    from url in feedUrls
    let feed = SyndicationFeed.Load(XmlReader.Create(url))
    from item in feed.Items
    where item.PublishDate > DateTime.Now.AddDays(-30)
    orderby item.PublishDate descending
    select item;

// display the most recent 15 items
foreach (SyndicationItem item in items.Take(15))
{
    Console.WriteLine("{0} : {1}",
        item.PublishDate.Date.ToShortDateString(),
        item.Title.Text);
}

The code is able to filter and sort RSS items from an arbitrary number of blogs with a 6 line query expression. I was thinking of this code when I ran across Scott Hanselman's Weekly Source Code 19 – LINQ and more What, Less How. Scott's reader David Nelson had the following observation:

I disagree with Siderite, in that I think the LINQ example is more readable than the iterative example; however, as has been pointed out, it leaves no room for error handling or AppDomain transitions. This is a problem with LINQ in general; in trying to make everything very compact, it leaves too little room to maneuver.

The LINQ query I'm using isn't production code. If just one blog is down and the XmlReader throws an exception, the entire operation is borked. One solution is to wrap the feed reading into a method that uses exception handling and returns an empty SyndicationFeed in case of an exception - then invoke the method from inside the query. Could anything else go wrong? Sure - one null PublishDate on an item and again we'd be borked. Bullet-proofing a LINQ query might take some work, especially when dealing with third party types.  
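A rough sketch of that wrapper idea is below. To keep it self-contained I wrote it generically rather than against SyndicationFeed, so SafeQuerySketch and LoadOrEmpty are hypothetical names, and swallowing every exception type is a simplification you would probably want to narrow (and log) in real code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: isolates a risky load so one failure cannot abort the whole query.
public static class SafeQuerySketch
{
    public static IEnumerable<T> LoadOrEmpty<T>(Func<IEnumerable<T>> load)
    {
        try
        {
            // ToList forces enumeration here, inside the try block,
            // so deferred failures surface now rather than later in the query.
            return load().ToList();
        }
        catch (Exception)
        {
            // Assumption: a source that fails contributes no items.
            return Enumerable.Empty<T>();
        }
    }
}
```

Inside the mashup query, the second from clause could then read something like from item in SafeQuerySketch.LoadOrEmpty(() => SyndicationFeed.Load(XmlReader.Create(url)).Items), leaving the where and orderby clauses untouched.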

As LINQ moves us into the "What" instead of the "How", it might be harder to see these types of error scenarios. LINQ is a fantastic technology, but like everything in software, it is a good idea to look the gift horse in the mouth. 

posted by scott with 4 Comments

Talks You Won’t See At the Local Code Camp

The Lost Art of TSR Programming
Abstract: Return to the glory days of DOS 2.0 and INT 21h as we write a simple Terminate and Stay Resident application using the latest software development techniques. We will construct our x86 assembler code using test driven development and mock extended memory managers.

Why Am I Here On A Saturday?
Abstract: Because even if you weren't here, you'd still be at the computer. Don't think you'd be doing chores at home, like dusting off the entertainment center, because chores are boring.

Life of a Gnat
Abstract: This session has nothing to do with GNU software, but will describe (in excruciating detail) the journey of the common fungus gnat from egg to adulthood. Pictures of mating swarms may not be appropriate for younger attendees.

P.S. In all seriousness, the spring code camps are coming to the Mid-Atlantic and the topics are far better than the ones presented above.

CMAP Code Camp – April 12th in Columbia, MD

NoVa Code Camp South – March 29th in Woodbridge, VA

Richmond Code Camp – April 26th

posted by scott with 3 Comments

Visitors and Multiple Dispatch

The visitor pattern is an elegant solution to a specific class of problems, but also comes with drawbacks in mainstream programming languages. There are various techniques one can use to minimize the drawbacks, however. This weekend I found some old but good articles harnessing the power of visitor while removing some of the pain:

Both Brad and Kzu use reflection to effectively achieve multiple dispatch. The typical multiple dispatch mechanism used to implement the visitor pattern is a drag on maintenance and forces the pattern implementation to bleed into the visitor's targets – Brad's post shows all the gory details before he goes on to show a solution using reflection. Multiple dispatch isn't just a problem in C#, C++, and Java – it's also a problem in dynamic languages like Python and Ruby. Only a handful of languages inherently use multiple dispatch, or multimethods, with Common Lisp being the most notable.

I tried a dispatch mechanism using Expression<T> today, but (in case you don't want to read any further) did not get a satisfactory result. If you keep reading, maybe you can think of something I missed. 

Given these classes:

public class Employee
{
    public string Name { get; set; }
}

public class Manager : Employee
{
}

How can you write a class like the following?

public class EmployeeReport
{
    public void Generate(IEnumerable<Employee> employees)
    {
        foreach (Employee employee in employees)
        {
            Visit(employee);
        }
    }

    public void Visit(Employee employee)
    {
        // employee specific work ...
        Console.WriteLine("Visiting employee {0}", employee.Name);
    }

    public void Visit(Manager manager)
    {
        // manager specific work ...
        Console.WriteLine("Visiting manager {0}", manager.Name);
    }
}

The compiler will lay down code to invoke the Visit method that accepts an Employee reference, even when we really are referencing a Manager object at runtime. We need some mechanism that will thunk us into the best method based on the runtime type of the parameter, preferably without using a switch statement. A naïve try might look like the following:

public void Generate(IEnumerable<Employee> employees)
{
    foreach (Employee employee in employees)
    {
        DispatchVisit(employee);
    }
}

void DispatchVisit(Employee visitee)
{
    Expression<Action<Employee>> expression = e => this.Visit(e);
    expression.Compile().Invoke(visitee);
}

Even though this code is generating IL at runtime via Compile, it still invokes the wrong method when we reach a Manager object. At compile time the compiler generates an expression tree to invoke the Visit accepting an Employee reference. We could go into the expression and make modifications to the Body before we call Compile, or we could build an expression tree manually. Either way, we need to pick out the exact method we need at runtime using the type of the parameter and a call to GetMethod:

void DispatchVisit(Employee visitee)
{
    MethodInfo methodInfo = this.GetType().GetMethod(
        "Visit", new Type[] { visitee.GetType() });

    ParameterExpression e = Expression.Parameter(visitee.GetType(), "e");

    LambdaExpression expression =
        Expression.Lambda(
            Expression.Call(
                Expression.Constant(this, typeof(EmployeeReport)),
                methodInfo,
                e),
            e);

    expression.Compile().DynamicInvoke(visitee);
}

This code works, because we are now getting a MethodInfo object that will reference Visit(Manager manager) when we have a manager object at runtime. But, the code is starting to look just as hairy as any other code generation technique, and we haven't even tried to address caching the compiled expression for performance, or moving up to an Expression<T>.
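For what it's worth, here is one possible sketch of the caching idea, under a few assumptions of my own: both Visit overloads are public so GetMethod can find them, a static ConcurrentDictionary keyed on the runtime type holds the compiled delegates, and a Visited list replaces Console.WriteLine purely so the result is easy to inspect. None of this is the approach from Brad's or Kzu's articles.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

public class Employee
{
    public string Name { get; set; }
}

public class Manager : Employee
{
}

public class EmployeeReport
{
    // One strongly typed delegate per runtime type, compiled on first use.
    static readonly ConcurrentDictionary<Type, Action<EmployeeReport, Employee>> Dispatchers =
        new ConcurrentDictionary<Type, Action<EmployeeReport, Employee>>();

    // Hypothetical addition: records visits instead of writing to the console.
    public List<string> Visited = new List<string>();

    public void Generate(IEnumerable<Employee> employees)
    {
        foreach (Employee employee in employees)
        {
            DispatchVisit(employee);
        }
    }

    public void Visit(Employee employee)
    {
        Visited.Add("employee " + employee.Name);
    }

    public void Visit(Manager manager)
    {
        Visited.Add("manager " + manager.Name);
    }

    void DispatchVisit(Employee visitee)
    {
        Action<EmployeeReport, Employee> dispatcher =
            Dispatchers.GetOrAdd(visitee.GetType(), t => BuildDispatcher(t));
        dispatcher(this, visitee);
    }

    static Action<EmployeeReport, Employee> BuildDispatcher(Type runtimeType)
    {
        MethodInfo method = typeof(EmployeeReport).GetMethod(
            "Visit", new Type[] { runtimeType });
        if (method == null)
        {
            throw new InvalidOperationException("No Visit method for " + runtimeType.Name);
        }

        // Builds (report, employee) => report.Visit((RuntimeType)employee)
        ParameterExpression report = Expression.Parameter(typeof(EmployeeReport), "report");
        ParameterExpression employee = Expression.Parameter(typeof(Employee), "employee");
        MethodCallExpression call = Expression.Call(
            report, method, Expression.Convert(employee, runtimeType));

        return Expression
            .Lambda<Action<EmployeeReport, Employee>>(call, report, employee)
            .Compile();
    }
}
```

Passing the report in as a lambda parameter, rather than baking this in as a constant, lets every EmployeeReport instance share the cache, and the typed delegate avoids the cost of DynamicInvoke on each call.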


posted by scott with 13 Comments

Travelogue: India

Outside the Ella Compass Suites

I spent the last week of February in Hyderabad, India. This was my first trip to India, and I thought I'd share some experiences.

The Flight

I flew from Washington D.C. to Hyderabad on Qatar Airways. The longest leg, between D.C. and Doha, was on a Boeing 777-300ER – a long range jet with the largest engines in aviation history. The business class configuration on the plane made the time pass with relatively little agony. The lay-flat seats gave me 12 hours of sleep, and I used the on-demand entertainment system to pass the time with some new movies (mostly on the way home):

I highly recommend Qatar Airways. The business lounge in Doha is comfortable, and I had time to grab a shower in the marbled and well-stocked washrooms. Service is second to none, and the food is the best I've ever had in the air – smoked salmon, brie, champagne, filet mignon, prawns, and Godiva chocolates. The endless stream of food and drink was served by a team of young models – all smiling and very easy on the eye.

The downside to this experience is that my next domestic flight will feel worse than ever. Satan himself recruits airline executives from the United States to manage the first circle of hell.

The Airport

Golconda Fort

I just have to tell you about the Hyderabad airport experience. My flight, like most international flights to India, arrived in the middle of the night. You'd think an airport would be empty at 3 in the morning, but you'd be wrong. There was a 30 minute wait to get through immigration, which isn't too bad, but it was amusing. Lines would close and open arbitrarily, forcing people to shuffle and maneuver into new lines. There were things that happened in this shuffle that would have resulted in bloodshed had they happened in, say, Newark, N.J. But this was India, and personal space can't afford to exist.

At one point a new line opened, but the man behind the counter shouted "women only!" So I stayed in the long line of forlorn looking men and together we stared at this long line of women. For a brief moment I was hoping this would turn into one of the scenes I've watched in Bollywood movies. You know – the line of women start singing and dancing, and then the line of men start singing and dancing, and before you know it, the entire cast is embracing and gyrating out the door to freedom. I did some light stretching to warm up in case the music started, but nothing materialized.

The arrivals area outside was a sea of humanity. People were stacked 7 deep waiting for loved ones to arrive, or perhaps they were there just to watch the spectacle. I found my driver quickly, and waited just a few minutes for him to bring up the car. I listened to the jangle of excited voices. Watched the flashing and weaving of motorbikes. Felt a tug from a young lady begging. Smelled diesel fumes from rickshaw motors. Tasted dirt in the air.

India welcomed me with an assault on all senses.

Hyderabad

Fruit stand outside Hyderabad

Most of my time was spent in HITEC City, slightly northwest of Hyderabad proper. The Microsoft campus there is eerily similar to the campus in Redmond, right down to the color of the glass and signage. The only major difference is the unrelenting source of heat and light that hangs in the sky over the Hyderabad campus. Redmondites might have heard of it – it's called the sun. I was on campus every day teaching for Pluralsight.

I did get a chance to venture into the city, and also to the nearby Golconda fort. The fort is built on a hill of granite and dates back to 1143. The outer wall has a circumference of 7km. Inside are iron studded gates, mosques, temples, ornate stonework, and a blend of Hindu and Muslim architecture. The gates also featured an impressive acoustical effect. A person can stand under the carefully designed dome and clap. That clap can be heard at the top of the mountain over one kilometer away! Hundreds of years ago, the claps would signal the king's arrival, or an enemy's encroachment.

Riding in the streets of Hyderabad I witnessed only a glimpse of the diversity that is in India. A woman wrapped in a brightly colored sari stands alongside a Muslim woman covered in a plain black jilbab. An old farmer with a handful of vegetables stands alongside a young businessman with his cell phone. It's a dizzying mix of life in the swarming streets of an old city. And yes, the driving in India is everything you've ever heard it is – and quite possibly worse than you ever imagined.

The road is lawless, but the people are polite. I'd say hospitality in India is second to none. Everyone I talked to was warm, inviting, and eager to share a story. It was entertaining to sit down at a meal and listen to conversations. Some talks would begin in English then suddenly veer into a local tongue. Another blink of the eye - and it's English again. Fortunately, someone was always willing to translate for me.

Finally, I'd be remiss if I didn't tell you about the food. I've developed a passion for Indian food over the last 5 years, and was delighted to find a mix of both northern and southern cuisines in this centrally located city. I had vadas, idlis and sambar during breakfasts. Dosas and biryanis ruled for lunch, while paneer dishes (including my favorites palak paneer and mutter paneer), and a variety of curries and tandoori dishes were fair game for dinner. I was as close to achieving a gastronomic orgasm as one can get (although I'll be honest – by the end of the week I had this undeniable craving for a New York style pizza).

Suddenly, I Was Gone

Downtown Hyderabad

One week isn't a great deal of time to spend somewhere so far from home. I was just getting over the jetlag when I stayed up to catch a 5 a.m. flight home. In the airport, I was reading the local newspaper when I came across some funny ads:

Increase your bust line with a new herbal mix developed by scientists in the United States! Start looking more attractive with a bigger bust line in 3 months! Call now!

And:

Increase your height using a new lotion developed in the United States! Don't be short any longer – just apply this lotion to the soles of your feet once a day for six months! Order now!

Funny, because I think I've seen the same rip-offs advertised in U.S. newspapers, only the lotions are developed by wise monks who live in the misty mountains of India.

I guess it's true the world over – countries far away from your own carry an aura of mystique and magic. Anything is possible over there!

I was fortunate enough to see the mystique of India up close.

posted by scott with 13 Comments

Extension Methods for Profit

Some say we are living in the information age, but I say this is the advertisement age. Marketers strive to cover every square inch of the planet and outer atmosphere with slogans and promotions. We have ambient advertising and advergames, human billboards and celebrity branding.

My prediction is that marketers will become more aggressive in placing advertisements directly inside the software used by information workers who are laden with disposable income. Microsoft is well positioned for the next wave of advertising with the addition of extension methods in C# and Visual Basic. Just imagine the number of eyeballs that will see the following methods in an IntelliSense window.

namespace System
{
    public static class CurrentAdvertisements
    {
        public static void TheJoyOfPepsi(this object cola)
        {
        }

        public static void BuiltFordTough(this object truck)
        {
        }

        public static void EatAtHooters(this object notForTheFood)
        {
        }
    }
}

Big bucks are waiting there.

The only question is – will advertisers be content to stick with the traditional Pascal casing of method names, or will they pay a premium for Underscored_Product_Placements?

posted by scott with 13 Comments