Thursday, May 15, 2014

How to execute an operating environment command from a SAS session

This topic has been discussed in many SAS papers. Below are my own considerations:

1. X statement / X command
It is the most popular one. I always use it in interactive mode. However, I would suggest not using it in batch mode, since it is difficult to control the task and get its status.

2. %SYSEXEC macro statement
As you know, macro string quoting is always a challenge for SAS programmers. If the command contains any special characters, troubleshooting can take a long time. To keep the code clean and readable, I prefer not to use %SYSEXEC in our production code.

3. SYSTEM function / CALL SYSTEM routine
It works well inside a DATA step.
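
A minimal sketch of the SYSTEM function, assuming a UNIX host; the command and the path /tmp are only placeholders for illustration:

```sas
data _null_;
    /* SYSTEM runs the command and returns the operating-system return code */
    rc = system('ls /tmp > /dev/null');
    if rc = 0 then put 'Command succeeded.';
    else put 'Command failed: ' rc=;
run;
```

Because the return code lands in an ordinary DATA step variable, the step can branch on it immediately.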

4. FILENAME statement PIPE engine
It is convenient for capturing the output of a command.
Tip: to get the return code of the command as well, we can use this: FILENAME CMD PIPE "command; echo $?";
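
Here is a small sketch of reading the piped output, assuming a UNIX shell; the command ls /tmp and the 256-character line width are placeholders I picked for illustration:

```sas
/* Single quotes keep the macro processor away from the command text */
filename cmd pipe 'ls /tmp; echo $?';

data _null_;
    infile cmd truncover;
    input line $char256.;
    /* Each line of command output is echoed to the log;   */
    /* the final line is the return code written by echo $? */
    put line;
run;

filename cmd clear;
```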

5. SYSTASK statement
It is my favorite one, because it gives you more control over the command: you can execute many tasks in parallel, list tasks, kill a task, and get the status easily. Furthermore, it has built-in support for executing commands asynchronously.
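
A minimal sketch of running two commands in parallel with SYSTASK, assuming a UNIX host; the sleep commands and the names t1, t2, rc1 and rc2 are placeholders for illustration:

```sas
/* Start both tasks without waiting; STATUS= names a macro variable */
/* that receives each task's return code                            */
systask command "sleep 5" nowait taskname=t1 status=rc1;
systask command "sleep 3" nowait taskname=t2 status=rc2;

/* Write the active tasks to the log */
systask list;

/* Block until both tasks have finished */
waitfor _all_ t1 t2;

%put NOTE: t1 ended with status &rc1 and t2 with status &rc2..;
```

A task that is no longer wanted can be stopped with SYSTASK KILL followed by the task name.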

Wednesday, May 14, 2014

colon(:) operator modifier - Note 2

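The WHERE statement is a very useful tool for subsetting data. However, the colon (:) operator modifier cannot be used in a PROC SQL WHERE clause (the WHERE statement of a DATA step or another procedure does accept a prefix comparison such as name =: 'J').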
So far I have found the following workarounds:
1. PROC SQL truncated string comparison operators such as EQT, GTT, and LET:
I could not find them in the SAS documentation I checked, so they appear to be undocumented. For more information, please read SAS Global Forum paper 056-2009.

Please note that they can be used only in PROC SQL, for example in its WHERE clause.
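
A short sketch of EQT in a PROC SQL WHERE clause, using the same sashelp.class data as the other examples in this note:

```sas
proc sql;
    /* EQT truncates the longer operand to the length of the shorter one, */
    /* so only the first character of name is compared with 'J'           */
    select name
    from sashelp.class
    where name eqt 'J';
quit;
```
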
2. LIKE operator:
WHERE name like 'J%'; * all names beginning with J;

3. Encapsulate the operator in a PROC FCMP function:
proc fcmp outlib=work.funcs.trial;
   function func_in(a $, b $);
      if a =: b then result=1;
      else result=0;

      return (result);
   endsub;
run;

options cmplib=work.funcs;

data test;
    set sashelp.class;
    if func_in(name, 'J');
run;

Although they cannot fully replace the colon (:) operator modifier, they are useful alternatives to keep in mind.

colon(:) operator modifier - Note 1

When strings of different lengths are compared, the colon (:) operator modifier is very helpful, since it truncates the longer string before the comparison.
However, what is the "length"? Is it the actual length of the string without trailing blanks (from the LENGTH function), or the amount of memory allocated for the string (from the LENGTHM function)?

To answer the question, please see this sample:
data a;
    length a1 a4 $10;
    a1 = 't';
    a4 = 'test';
    
    flag_a1 = ifc(a1=:'te', 'Y', 'N');
    flag_a4 = ifc(a4=:'te', 'Y', 'N');

    put a1= flag_a1=;
    put a4= flag_a4=;
run;

Output:
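a1=t flag_a1=N  (Explanation: the comparison uses the memory length of a1, which is 10, not its trimmed length of 1. a1 is therefore the longer operand and is truncated to its first two characters, so the implicit string comparison is 't ' = 'te'.)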
a4=test flag_a4=Y


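Conclusion: the colon (:) operator modifier works with the memory length (LENGTHM), so trailing blanks count. It truncates the longer string to the length of the shorter one, and a short value stored in a longer variable is treated as padded with blanks.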
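
To make the difference visible, here is a small sketch that prints both lengths and repeats the comparison on a trimmed value; the variable names len, lenm, flag_raw and flag_trim are my own, for illustration only:

```sas
data _null_;
    length a1 $10;
    a1 = 't';

    len  = length(a1);    /* length without trailing blanks */
    lenm = lengthm(a1);   /* allocated (memory) length      */

    /* Compare the variable as stored, trailing blanks included */
    flag_raw  = ifc(a1 =: 'te', 'Y', 'N');

    /* Compare after removing the trailing blanks with TRIM */
    flag_trim = ifc(trim(a1) =: 'te', 'Y', 'N');

    put len= lenm= flag_raw= flag_trim=;
run;
```

If the two flags differ, the trailing blanks are what changed the result, which is why I would consider TRIM when the variable may hold a value shorter than the comparison string.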