Tuesday, May 29, 2012

My Letter to the Editor of Linux Format

Linux Format published a series on writing Assembly Language Applications.  I personally thought this was obscure and the author was showing off.  With the speed of modern computers I found the subject ridiculous, so I wrote the following letter to the Editor:


In regard to Mike Saunders' column on Assembly:
I find the whole concept strange and obscure. When you take the speed of today's computers into account, there is virtually no need to reach for Assembly. For those who do, it is usually a personal choice and not a technical one.
If you find that your application has a performance problem, it can usually be fixed with a more efficient algorithm. Next, I would look at rewriting the most CPU-hungry routines in C. If that doesn't work, go back and look at the algorithm again.
There are some real problems with writing apps in Assembly:
      1. Maintenance is difficult and expensive.
      2. Moving the app from one processor to the next often requires a complete rewrite.
      3. It takes many times longer to write in Assembly and requires specialist knowledge. So the programming time costs more per hour and the number of hours is higher.
      4. He was writing in 32 bit when most processors today are 64 bit. Running 32-bit code on a 64-bit machine requires it to be run in an interpreted mode, and it actually performs many times worse than if it had been compiled in C for a 64-bit processor.

Let us remember that virtually the entire Linux OS is written in C. I believe there are still some low-level routines written in Assembly, but the vast majority is written in C. If an OS as complex as Linux can be written in C with good performance, the argument for writing in Assembly seems weak at best.
Terry Haimann
Des Moines, Iowa, USA

Saturday, January 14, 2012

Software Tuning for fun

If you had asked me 15 years ago what my primary programming language was, I would have answered “C”. But it is very hard to produce a GUI app in C, and shortly thereafter I started using Delphi. Delphi with an add-on MySQL database library called MySQLDAC makes a powerful environment, allowing the quick development of useful database applications.

As I became more and more interested in Linux (I have always liked obscure OS's since I used to use OS/2), I became aware of a Delphi for Linux (and other OS's) called Lazarus.

In reality it is two products in one. The front-end app is indeed called Lazarus and is a GUI application development environment very similar to Delphi, but behind it sits Free Pascal. Free Pascal is an Object Pascal compiler that is almost identical to the Object Pascal that Borland came up with all those years ago. So in effect you write your app in Lazarus, and then Lazarus hands it over to Free Pascal to actually compile it.

So if you want to write a GUI app, you can use Lazarus, and if you want to create a non-GUI app, such as something you can run as a cron job (cron runs tasks in the background in Unix environments), it can be written in straight Pascal. It is easier to write than C and everything seems wonderful in the world.

Now my primary desktop is an HP with an AMD 64-bit quad-core processor. More specifically, it is a Phenom 9150e. At the time I also had a laptop with a dual-core Pentium (32-bit) CPU, which I used for traveling and when I was otherwise away from home. I found a copy of the Dhrystone benchmark written in Pascal and on a lark ran it on the laptop and then later, for comparison purposes, on the desktop. The laptop blew the desktop away. How was this possible? The desktop had a 64-bit CPU while the laptop was running an older 32-bit CPU.

Well, AMD processors run the same x86 instruction set as Intel processors, but their internal designs are NOT the same, so code that is scheduled and tuned for an Intel chip can run slower on an AMD one while still giving correct results. The Free Pascal Compiler seems to generate code tuned only for Intel, and there is very little tuning that can be done. There were a couple of recommendations given on the Lazarus Forum, but they made virtually no difference.

I then found the Dhrystone benchmark written in C, compiled it and ran it on both machines, and the run times were much as I would have expected, with it running much faster on the desktop.

In truth, with the speed of modern computers you really aren't going to notice the slowdown that much unless you throw a lot of data at it. The moral of this part of the story is: if you want to run Free Pascal and want it to run fast, get an Intel processor. To be fair to Lazarus and Free Pascal, they are a very small project that is supported by hobbyists alone.

A few weeks ago I decided to have a little refresher course in C, so I wrote some programs that created some large MySQL tables and then did some crunching with them. MySQL was actually doing most of the work, but the C programs were issuing the queries to be run. Well, you can tune C programs compiled with gcc a lot better than Free Pascal programs. But gcc is supported by large corporations (like IBM and HP) and has a lot more money to work with.

On the desktop I tried compiling it three different ways:

  1. No tuning: gcc Program.c -o Program `mysql_config --cflags --libs`

  2. Using mtune switch: gcc -mtune=athlon64 Program.c -o Program `mysql_config --cflags --libs`

  3. Using the march switch: gcc -march=native Program.c -o Program `mysql_config --cflags --libs`


I ran the programs multiple times, giving me an average runtime:



                 No Tuning       mtune=athlon64   march=native
Wall Time        6.2 seconds     6.19 seconds     6.22 seconds
User App time    0.75 seconds    0.75 seconds     0.77 seconds

The gcc web site recommends using “-march=native” which in this case gave the worst performance, but not by much. I did just as well by not tuning at all!!! For the record, wall time is the actual runtime as seen on a clock. I got these results by running the script using the Unix time command.

But I have since upgraded my laptop to an Intel i5 CPU, and the results are a little different.



                 No Tuning       mtune=native     march=native
Wall Time        5.68 seconds    4.78 seconds     4.62 seconds
User App time    1.08 seconds    0.75 seconds     0.74 seconds

So in this case, tuning seems to have helped. In both of these tables, most of the wall time appears to be spent waiting on the MySQL server to finish running the queries.