Saturday, August 06, 2022

I am, therefore I GREP!

I had a MariaDB table column I needed to grep against a file. Grepping one or two values against a file is ok and kind-of cool. But not hundreds, there has to be a better way. And I came up with one. 

I wrote a program that would do it all for me. And it's not that big. Of course, it being me I wrote it in Free Pascal. No need to make this a GUI, so Lazarus won't be necessary. 

First of all the source code:

Program GreppingGoodTime;
Uses SysUtils, db, sqldb, mysql80conn, Process, Classes; 
{$I /home/user/Documents/MyFreePascalDir/MysqlConnLib.inc}
{$I /home/user/Documents/MyFreePascalDir/MySecretConstants.inc}

Var
	MyConn:				TSqlConnector;
	MyTran: 			TsqlTransaction;
	MyQry:				TSqlQuery;
	Value:				String;
	MyProcess: 			TProcess;
  	MyStrList: 			TStringList;
  	i:				LongInt;

Begin
	MyConn := CreateConnection('MySQL 8.0', '127.0.0.1', 'MyDatabase', MyUser, MyPass);
	CreateTransaction(MyConn, MyTran);
	MyQry := GetQuery(MyConn, MyTran);
	MyQry.SQL.Text := 'select * from MyTable where MyIntColumn > 0';
	MyConn.Open;
	MyQry.Open;
	MyProcess := TProcess.Create(nil);
	MyProcess.Executable := 'grep'; 
	MyProcess.Options := MyProcess.Options + [poWaitOnExit, poUsePipes];
	MyStrList := TStringList.Create;
	If MyConn.Connected Then
		Repeat
			Value := MyQry.FieldByName('MyStrColumn').AsString;
			WriteLn('grep ', Value, ' MyFile.csv');
	  		MyProcess.Parameters.Add(Value); 
	  		MyProcess.Parameters.Add('MyFile.csv');
	  		MyProcess.Execute;
	  		MyStrList.LoadFromStream(MyProcess.Output);
	  		If MyStrList.Count > 0 Then
	  			For i := 0 To MyStrList.Count - 1 Do
	  				WriteLn(MyStrList[i]);
	  		MyStrList.Clear;
	  		MyProcess.Parameters.Clear;
			MyQry.Next;
		Until MyQry.Eof;
	MyStrList.Free;
	MyProcess.Free;
	MyQry.Close;
	MyConn.Close;
	MyQry.Free;
	MyConn.Free;
	MyTran.Free;
End.

 

The second line starting with "Uses" are just how Pascal includes libraries. Lines four and five are some files I have included.  The second one of these is just where I declared some constants, mainly the database user and password.  

The first include file has the MySQL connection routines and that file is listed below:

function GetQuery(Conn: TSqlConnector; Tran: TSqlTransaction) : TSQLQuery;
  var MyQuery : TSQLQuery;
begin
  MyQuery := TSQLQuery.Create(Tran);
  MyQuery.Database := Conn;
  MyQuery.Transaction := Tran;
  GetQuery := MyQuery;
end;

Procedure CreateTransaction(Conn: TSqlConnector;Var Tran: TSqlTransaction);
Begin
  Tran := TSQLTransaction.Create(Tran);
  Tran.Database := Conn;
End;
	
Function CreateConnection(CType, HostAddr, Database, User, Password: String): TSqlConnector;
Var
	Conn:	TSqlConnector;
Begin
	Conn := TSQLConnector.Create(nil);
	Conn.ConnectorType 	:= CType;
	Conn.Hostname 	:= HostAddr;
	Conn.DatabaseName  	:= Database;
	Conn.UserName		:= User;
	Conn.Password		:= Password;
	CreateConnection 	:= Conn; 
End;


Starting at the Begin in the main program, the first six lines connects to the database and opens the Query.  The next four lines setup the TProcess variables which are used to run external processes, this is best explained by looking at he following web page

https://wiki.freepascal.org/Executing_External_Programs

Look at the section on "TProcess", both the simple example and the improved example.  I use this all of the time.  Then I just create a small "REPEAT" loop that steps through the table, the first "writeln", just prints out what grep command I am trying to run.  The next three statements just add the parameters to the grep command and the third statement runs grep as a background process.  

Then the "MyStrList.LoadFromStream(MyProcess.Output);" loads any output from StdOut to a TStringList.  This can be a little dangerous, if there was a million lines of output, it could kill the main program.  But that has never happened to me, most modern computers should be able to handle a couple thousand lines of output. Finally I have a "FOR" loop that prints out any lines in the TStringList, I then clear the TStringList and any Parameters for the External process so new ones can be added in the next loop.  And lastly, I do a "MyQry.Next" which will go to the next record.

The end of the program just does some cleanup.

It ran relatively fast, 167 rows were greped and ran for less than half a second.