## Saturday, February 18, 2017

### Amusements

I ran across the following article in a math geek book I found at Barnes and Noble:

So basically, you sum up the values of ever element that is in an even position and then you take every element in an odd position and multiply it by two.  Then if it is a single digit add it to the sum.  If it is two digits, add each digit to the sum separately.  Then if the answer is evenly divisible by 10, it is a possible valid Card number.

So I came up with the following Pascal source:

For i := 1 to 16 Do
Begin
If (i Mod 2) <> 0 Then
If CardArray[i] < 5 Then
OddSum := OddSum + (CardArray[i] * 2)
Else
Begin
OddSum := OddSum + 1;
OddSum := OddSum + ((CardArray[i] * 2) - 10);
End
Else
EvenSum := EvenSum + CardArray[i];
ValidEdit.Text := IntToStr(EvenSum+OddSum);
End;

CardArray is a table with the first sixteen elements loaded with each digit of a credit card and after running it through the above source it gave me a number evenly divisible by 10.  Except one time when I got one of the digits wrong, then it was not evenly divisible by 10.  So I have to say the algorithm works.

Cool, Huh???

## Wednesday, September 23, 2015

### Followup to Fun with SQL

I increased the size of my test:

In the first graph it showed that the improvement had stopped at about 80 processes.  I'm going with that, since on one of the runs with 80 processes, Mariadb started kicking out "No connections available" error.  It only happened on one run.  The 2nd chart shows that it is still improving, I will admit that I was having trouble getting accurate numbers with that many processes running.

75 processes seems ok, anything more and I think I was not making much additional progress.

## Saturday, September 19, 2015

### Fun with SQL

MySQL/MariaDB is supposed to be one of the fastest SQL servers on the market.  But is that accurate? They do very fast selects (extracting data), but inserts and updates are considerably slower.

As a standard benchmark, I ran a program that inserted about 50,000 records into a table.  This was done on an i5 desktop running MariaDB under Linux Mint.  I broke it down into approximate 5000 record segments.  The input file of random data was created in Free Pascal and written to a Pascal flat file.

The chart is pretty much flat over this group of data. My experience suggests, though, that if I put more data through, the inserts would slow down as the file got bigger.

[Notes on Pascal: The CLI Pascal program could not read from the Pascal Flat File Format and then write to the database.  I suspect that there is a memory leak in the standard Pascal File Calls that was damaging the MySQL data structures.  I ended up just reading the data and writing it out to Standard Output.  I then piped this to MariaDB's command prompt.  Neither process used very much CPU:

0.7   2 19670 myuser    /usr/bin/mysql

Why would I even write data in the Pascal Format?  It's an obsolete file format that has been around since at least 1980.  But, it is very fast.  On my i5 laptop, it can write data out as fast as 11 million records per minute.  So, if I was in a situation where I had to capture a lot of data, I would consider it.]

In truth, writing to MariaDB is not very fast. If you had a lot of data to import, it would be like watching molasses in January.  I have worked with MySQL and it's performance isn't any better.

Both of these database servers are supposed to be multi-threaded.  In theory it should be possible to increase throughput by running multiple inserts at the same time. But is that the case?

This show the average of how many records per second were being added when the table had 50,000 records in it.  The X axis is the number of processes and the Y axis is the records being added per second.

This graph shows how long it takes in seconds to reach 20,000 records.  Again the X axis is number of processes and the Y axis is the number of seconds required.  These two graphs are basically mirror images of each other.

The conclusion:  Running multiple processes to increase throughput does work.  I have increased throughput by more than a factor of 10.  I have not hit the point of diminishing returns. Even on a home computer, MariaDB could handle 25 proccesses hitting the same table.  What's more, it's not taking that big of a hit on the CPU.

0.0   3 20744 myuser    ./LoadData2 PersonFile01      0
0.0   1 20748 myuser    ./LoadData2 PersonFile02      0
0.0   2 20754 myuser    ./LoadData2 PersonFile03      0
0.0   0 20759 myuser    ./LoadData2 PersonFile04      0
0.0   2 20765 myuser    ./LoadData2 PersonFile05      0

0.1   1 20745 myuser    /usr/bin/mysql --binary-mod   0
0.1   0 20749 myuser    /usr/bin/mysql --binary-mod   0
0.1   0 20755 myuser    /usr/bin/mysql --binary-mod   0
0.1   1 20761 myuser    /usr/bin/mysql --binary-mod   0
0.2   0 20766 myuser    /usr/bin/mysql --binary-mod   0

The above shows the load these were adding to my system while running five processes.  The first column shows the cpu load for that specific process, and the second column shows what cpu the process is running on.  Total for all 10 processes isn't much more than 0.6% spread across four CPU's.  Even running 25  processes, I was only pulling 2.3% load spread across four processors. That is only about 0.6% per CPU.

## Sunday, April 05, 2015

### MySQL Access Comparison

Just for fun, I compared MySQL access times from several different languages.  Python which is REAL popular these days, Free Pascal (An Open Source Object Pascal Compiler) and c (gcc).

Here are the results:

And from the point of view of Transactions per second

I was really kind of surprised how much better c did then Free Pascal.  I did a comparison several months ago that showed c and Free Pascal performing much closer to each other, so I assume the Free Pascal Library needs some work.  Also, I was surprised that by changing MariaDB from a my-small to my-medium I saw no improvement.  Below is a chart from the sort comparison:

## Friday, January 09, 2015

### How to Write your own Monitoring App

First the problem:

Some Linux users like to run servers and or services. You could be running Samba, MySQL, Web Servers and maybe even a Chat Server. You could be running these all on your desktop, but that wastes electricity and uses up your desktop when you may want to use it for other purposes. A primary desktop would need to be up all the time, and many desktops have a high end video card which uses as much electricity as the rest of the computer combined. Why not run them all on one or more headless servers? This could be one big processor running multiple virtual machines or on single less powerful computers (maybe even “Raspberry Pi”s) each supporting a specific service. A headless i3 could come in between 60 -70 watts, while a gaming rig can use 400 watts or more.

But how do you monitor them? You can download and install Nagios or something like that, but it could be more fun to write your own. You can design the GUI app to fit your needs, not what someone says you need. Maybe you want the app to sit in a little corner and not use up much space, or maybe you want it large on one computer all by itself.

Second, what language (Compiler or Interpreter) are you going to use.

There are many choices and in reality you can make almost any of them work, so it is going to come down to your personal choice. Okay, I like Lazarus, which is a GUI front end to the Free Pascal Compiler. It is a true compiler and generates relatively fast code. Lazarus will run circles around Python performance-wise, but in truth c++ will outperform Lazarus. Pascal is easier to write then c++ and you can have a working app in much less time. In performance tests I have seen, Free Pascal generated programs that run on average 2.5 times longer then an equal program written in c++, but Python will run about 6 times longer then an equivalent Free Pascal program. Your results may vary, in the below charts, Python took 50 times longer to run then Free Pascal, and the differences between Free Pascal and c was marginal. On the learning curve, Python is the easiest to learn (but not by much), followed by Pascal, and the hardest to learn is c++.

If you think that this charts is irrelevant to computing today. Sorting is a fairly good way to compare performance of compilers or interpreters and the source code is freely available on the internet. And lastly, I have an application that had a Procedure that updated a MySQL Table with a sort value that would change on a daily basis, and then sort the output. It had a performance issue due to the fact that MySQL updates data slowly, but can select it very quickly. I rewrote the application to load the table into RAM, update it there and then run an internal sort. This Procedure went from running minutes to milliseconds. There may come a point where it will become necessary to run a sort on data inside your program.

The Algorithm.

I knew that I wanted to start ssh processes into each server and then process the returned data. When you put the command right after the address to connect to, the session will terminate after the command is run. I have written previous programs that have started background processes that just returned the output, which could then be processed by the main application. This is relatively easy to do; you need to add Process to the uses clause and declare a TProcess and a TStringList Variable.

uses
Classes, SysUtils, Process;

Var
MyProcess: TProcesss;
MyStrList: TStringList;

And then the following needs added at the point where you want to start the process:

MyStrList := TStringList.Create;
MyProcess := TProcess.Create(nil);
MyProcess.CommandLine := 'ValidLinuxCommand';
MyProcess.Options := MyProcess.Options + [poWaitOnExit, poUsePipes];
MyProcess.Execute;
{ Do something with the String List Here }
MyStrList.Free;
MyProcess.Free;

That isn't really all that hard, and it will work as long as the output is smaller then about 2k. If it is going to be larger, we need to add more complexity. This is explained in detail at:

http://wiki.lazarus.freepascal.org/Executing_External_Programs

Initially I was thinking it would need to be a multi-threaded app, and I started researching that option. I was afraid one stuck process could freeze the whole application. The more I looked at that option, however, the more complex it sounded. Each thread could not update graphical components within the main app (that can crash the Window Manager), so they would have to update basic objects (Integers, Strings, floats and such) within the main thread and then have the main thread update the component objects and then there is added complexity of needing to use Critical Sections (this would prevent more than one thread from accessing shared object at the same time).

I then thought, well maybe I can start each subprocess and not have to wait on it finishing, and then just create a loop that tests each subprocess's state. If the subprocess has finished, process its output and then just start a replacement subprocess. No messing around with all of the multi-thread stuff.

So first of all I declared a simple record:

TProcessRec = Record
Process: TProcess;
User,
Pct: Double;
StrList: TStringList;
End;

And then declared an Array of that Record:

ProcessTable: Array[1..KKArraySize] of TProcessRec;

Now we want to set up the main loop, but first we want to prime the pump and start all the ssh processes needed:

For i := 1 to KKArraySize Do
Begin
Begin
ProcessTable[i].Process := TProcess.Create(nil);
ProcessTable[i].StrList:=TStringList.Create;
ProcessTable[i].Process.CommandLine := CmdLine(i);
ProcessTable[i].Process.Options :=                           ProcessTable[1].Process.Options +
[poUsePipes];
ProcessTable[i].Process.Execute;
end;
end;

I am just looping through my Array and checking to see if there is an address defined, if so start a ssh. Note, I removed the poWaitOnExit from ProcessOptions because we don't want to stop and wait for each process to finish. Instead, we will test the subprocess in the next step. Also note that I am calling a function CmdLine to build the Linux command I want to run. Below is that function:

Function CmdLine(i: Integer): String;
Begin
CmdLine := 'sshpass -p "' + ProcessTable[i].Password +
'"   ssh ' + ProcessTable[i].User + '@' +
ProcessTable[i].Addr + ' sar -u 1 1;free';
end;

This is pretty simple, it just uses the Array to fill in user, Password and Address. As you can also note, I am using sshpass to pass the password to ssh and am also using the sar utility. sshpass can be found in most repositories and sar is part of the sysstat monitoring utilities which also can be found in most repositories. I found sar gave cpu utilization numbers very close to what top gives, so that is why I ended up using it. sar or whatever utility you end up using will have to be installed on all of the servers you are monitoring.

Now finally the main loop:

EndTime := Now + EncodeTime(0, 0, 5, 0); // 5 Seconds
While Now < EndTime do
Begin
Application.ProcessMessages;
Sleep(10);
End;
For i:=1 to KKArraySize Do
Begin
Begin
If ProcessTable[i].Process.Running = False then
Begin
(ProcessTable[i].Process.Output);
For j:=0 to ProcessTable[i].StrList.Count-1 Do
Begin
If copy(ProcessTable[i].StrList[j],1,7)
= 'Average' then
Begin
Sum :=
SumProcessorStats
(ProcessTable[i].StrList[j]);
PctStr := Format('%5.2f', [Sum]);
ProcessTable[i].Pct := Sum;
End;
End;
[ProcessTable[i].Pct]);
ProcessTable[i].StrList.Clear;
ProcessTable[i].Process.CommandLine := CmdLine(i);
ProcessTable[i].Process.Options :=
ProcessTable[1].Process.Options +
[poUsePipes];
ProcessTable[i].Process.Execute;
End;
End;
End;

Well, that is a mouthful! This is all in a Repeat ... Forever Loop, well, almost a forever loop because I do have a way to break it. The first While Loop, is just a means to get the App to wait 5 seconds before checking on the status of all of our background processes. The first two If statements seem silly and seem like they should be combined, but ... Since I only initialized a Process in the array where the Addr field had actual data in it, if we ran the “Process.Running” statement for an uninitialized Process variable, the app would have had a memory error and would have crashed.

(ProcessTable[i].Process.Output);

―— just loads the output from our ssh into our TStringList, which was declared in our Array. A TStringList is actually just an array of Strings and the output of the sar utility has the word Average in it. So the following block of code just sums up the output from the sar utility. I summed up user, nice, and system values listed from the sar utility. You may decide on a different strategy.

For j:=0 to ProcessTable[i].StrList.Count-1 Do
Begin
If copy(ProcessTable[i].StrList[j],1,7) = 'Average' then
Begin
Sum := SumProcessorStats (ProcessTable[i].StrList[j]);
PctStr := Format('%5.2f', [Sum]);
ProcessTable[i].Pct := Sum;
End;
End;

SumProcessorStats is a function and it does the adding of the sar outputed fields together.

Function SumProcessorStats(s: String): Double;
Var
UsrStr, NiceStr, SysStr: String;
Sum, Dbl: Double;
begin
UsrStr := copy(s,24,6);
NiceStr := copy(s,34,6);
SysStr := copy(s,44,6);
Sum := 0.0;
Dbl := StrToFloat(UsrStr);
Sum := Sum + Dbl;
Dbl := StrToFloat(NiceStr);
Sum := Sum + Dbl;
Dbl := StrToFloat(SysStr);
Sum := Sum + Dbl;
SumProcessorStats := Sum;
end;

Finally, since the previous subprocess had finished, this last piece starts a new subprocess:

ProcessTable[i].StrList.Clear;
ProcessTable[i].Process.CommandLine := CmdLine(i);
ProcessTable[i].Process.Options := ProcessTable[1].Process.Options +
[poUsePipes];
ProcessTable[i].Process.Execute;

Security is an issue here. The addresses, users and passwords could be hard-coded into your App. If somebody broke into your system though, they could get all of your passwords by looking at your source code. An improvement would be to have all that info in MySQL Table and have the Operator prompted for the password to access that table. Even better would be to have the data all encrypted and have the Operator prompted for a decryption key.

At some point, we may want to start sending e-mails out when certain conditions occur. Like if one of the servers is maxed out for several minutes, or if a server becomes unresponsive. Sending out e-mails isn't that hard to add. There is a library that can be downloaded for free called Synapse (http://wiki.freepascal.org/Synapse). You would need to download the library and add the following libraries to uses clause:

synautil, synacode, blcksock, pop3send, smtpsend

I copied the whole Synapse Library into the directory containing my app, and then whatever Lazarus needed, it would compile. The content of the e-mail would need to be added to a tstringlist and then to send the e-mail, use something similar to the following:

'EmailServer', MsgStringList) Then
ShowMessage('E-Mail Didn''t Work');

Performance

On an Intel i5, the main thread uses about 6% of the CPU on one core. Very acceptable.

## Thursday, November 06, 2014

### 2014 Midterms

Ok, I am a Republican. A liberal one, but still a Republican.  First of all I  didn't trust the Media and I didn't trust the Media's Polling.  Some of the Polling Firms I was throwing out like Survey/USA and sometimes Rasmussen.  I can show that certain Polling Firms lean further to the left then mean average.  Ok, Rasmussen's poll data on Presidential Approval Rating was way off from companies like Gallup, but their State Senate Polls weren't as bad.

So I flew by my own assumptions:

1. The Media Organisations were for the Democrats and their polling was trying to  manipulate the Electorate instead of inform them.  They were party activists instead of being reporters.

2. They were not estimating turnout correctly. I took the conservative assumption that were off by 5%.  So I adjusted the Democrat numbers down by 2.5% and upped the Republican numbers equally.

3. The Media was using too long of a range of polls.  A Poll taken a month ago was no longer valid, in my opinion.

4. I also thought the Electorate was very angry and election would have a very close correlation to 1978!

So I wrote a little app that used my assumtions and did some calculations.  It's crude but gave me some very good estimates.

This run was ran Tuesday morning, just before the election:

So my program estimated a pickup  for the Republicans of 10 Senators, not bad. I think I did better then the Pundits.  But I kept my mouth shut living in a multi-party family.  Also, my assumptions were incorrect in 2012.

As far as the economy goes, it will not improve until ACA is removed.  My proof is not perfect, it is statistical.  If you take all of the recesions in American history and take an average and standard deviation of their length, you get something like:

Average:  21.5 months
StdDev:    15.75

So, virtually all recessions should be over in (21.5) + (3 * 15.75) = 68.75 months.
That is just under 6 years.  If you don't understand why this is valid, it is because when you have a normal Distribution curve about 68% of all data will occur Plus or minus 1 Standard Deviation of the Mean (Average), 95% of all data will occur plus or minus two Standard Deviations of mean.  And 99.97% of all data will occur within three Standard Deviations of Mean!

Professional Economists do use this and their published results are actually a little more conservative, with a slightly smaller mean and slightly smaller standard deviation.  Which makes my argument just that much stronger.

So why are we still in a recession???  I know we technically we are not in one, but we arn't in a recovery either.  Unemployment is still too high and the only ones doing well, are the top couple of percent.  This is not normal.  The answer is the cost of the ACA is adding about \$15,000 dollars to the employers cost for a full time employee per year.  When small business persons run their numbers through their spreadsheets they can't make it work and therefore, they don't risk their money and there is no business expansion.

I would also argue that if the Democratic Party wants to be viable again, they need to drop the ACA.  If they have to over ride a Presidential veto, do it and do it quickly.  If you let the clock run out and the Republicans end up doing it after the 2016 elections they will get all of the credit for an improving economy.