Monday, February 23, 2009

Keep quotation in string

Have you ever needed both single and double quotation marks in the same string?
If that gives you trouble, try the code below:


*Macro variable assignment;
%let str = %str(quotation %" and %'test);
%put %superq(str);

*Macro variable reference;
%let mv = %bquote(&str);
%put %superq(mv);

*Data Step;
data _null_;
str = 'quotation " and '' test';
put str=;
str = "quotation "" and ' test";
put str=;
run;

Thursday, February 19, 2009

Keyboard shortcut for "Collapse All" and "Expand All"

Are you tired of clicking the mouse?

Try these keyboard shortcuts to save time.
Collapse All: Ctrl+Alt+"-"
Expand All: Ctrl+Alt+"+"
Collapse one node: Alt+"-"
Expand one node: Alt+"+"

Sunday, February 15, 2009

count words using PRX

Here is my way to count the words in a line:

data _null_;
input;
line = prxchange('s/\w+/A/', -1, _infile_); * replace every word with a single A;
count = count(line, 'A');
put count=;
cards;
this is a test
;

++++++++++++++++++++++++++++++++++++++++++++++++

Apparently, the method above is outdated.
With SAS 9.2, we can count the words in a string more easily.
The COUNTW function does the trick.
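
A minimal sketch of the COUNTW approach, assuming SAS 9.2 or later. Passing a blank as the only delimiter is my choice for this illustration; COUNTW's default delimiter list is broader.

```sas
data _null_;
input;
/* Count the blank-delimited words in the current input record. */
count = countw(_infile_, ' ');
put count=;   /* writes count=4 for the line below */
cards;
this is a test
;
```

Unlike the PRXCHANGE version, this does not need an intermediate string.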

Monday, February 9, 2009

PRX modifier "i" and "?"

Not all Perl regular expression features are supported by the PRX functions.

However, they do support two useful constructs: "i" and "?"
The case-insensitive modifier "i":
'/substring/i'

The non-capturing parentheses "(?:...)":
'/(?:substring1)/'

And they can be used together:
'/(?i:substring)/'
'/(?-i:substring)/'
'/(?-i:substring1)substring2/i'
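
A short sketch for trying these patterns with PRXMATCH. The sample text and variable names are made up for illustration, and the last two calls assume the inline modifiers behave as described above.

```sas
data _null_;
text = 'Hello World';
/* No modifier: the search is case-sensitive, so 'world' is not found. */
pos1 = prxmatch('/world/', text);              /* 0 */
/* Trailing i: the whole pattern ignores case. */
pos2 = prxmatch('/world/i', text);             /* 7 */
/* (?i:...) switches case-insensitivity on for 'hello' only. */
pos3 = prxmatch('/(?i:hello) World/', text);
/* (?-i:...) switches it off for 'hello' while the rest ignores case. */
pos4 = prxmatch('/(?-i:hello) world/i', text);
put pos1= pos2= pos3= pos4=;
run;
```

Comparing pos3 and pos4 in the log shows whether the group-level switch overrides the pattern-level modifier in your SAS release.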

functions for substring searching

Generally, there are four families of functions for substring searching:
ANY- and NOT- functions (e.g. ANYALPHA, NOTALPHA)
INDEX (INDEXC, INDEXW)
FIND
PRXMATCH


The ANY- and NOT- function families search a character string for the first character that belongs (or does not belong) to a given class, so they serve special purposes.
In practice, the COMPRESS function can sometimes be used as an alternative.

Ranked by the power of substring searching:
INDEX < FIND < PRXMATCH

And, naturally, ranked by complexity of usage:
INDEX < FIND < PRXMATCH

I prefer PRXMATCH since it is the most flexible. Although it takes a long time to learn, it is worth the effort.
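
To make the ranking concrete, here is a small sketch that searches the same string with each family. The sample text and variable names are hypothetical.

```sas
data _null_;
text = 'The quick brown Fox';
/* INDEX: plain, case-sensitive substring search. */
p_index = index(text, 'fox');            /* 0: case does not match */
/* FIND: adds modifiers such as i (ignore case) and a start position. */
p_find = find(text, 'fox', 'i');         /* 17 */
/* PRXMATCH: a full pattern, here a whole word regardless of case. */
p_prx = prxmatch('/\bfox\b/i', text);    /* 17 */
/* ANYDIGIT: position of the first digit, 0 if there is none. */
p_digit = anydigit(text);                /* 0 */
put p_index= p_find= p_prx= p_digit=;
run;
```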

Tuesday, February 3, 2009

defensive programming when using multiple INPUTs

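When there is more than one INPUT statement in a DATA step, we should take care of the end-of-file issue. If an INPUT statement tries to read past the last record, the DATA step stops immediately and the remaining statements in that iteration never run. So before each additional INPUT, we should check for end-of-file.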

It should be something like below:

infile in end=end; *or eof= option;
input;
* something here;
if not end then do;
input;
* something here ;
end;
* something here;

if end then do;
* something here;
end;
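
The skeleton above can be tried with a runnable sketch like the following. The fileref demo, the three test records, and the variable names are all made up for illustration. A temporary file is used instead of CARDS because, as far as I know, END= is not meant to be combined with in-stream data.

```sas
filename demo temp;

/* Write an odd number of records so the last "pair" is incomplete. */
data _null_;
file demo;
put 'header 1';
put 'detail 1';
put 'header 2';
run;

data _null_;
length header detail $ 40;
infile demo end=last;
input;
header = _infile_;
/* Guard the second INPUT so it never reads past end-of-file. */
if not last then do;
input;
detail = _infile_;
end;
else detail = '(none)';
put header= detail=;
run;
```

Without the guard, the second INPUT would hit end-of-file on the last iteration and the step would stop before the PUT statement, so 'header 2' would never be reported.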

Sunday, February 1, 2009

special numeric missing value

Generally, we test for a numeric missing value with "varname = .".
However, this is not a good SAS habit because it misses the special numeric missing values: a variable holding .A, for example, is missing but is not equal to ".".

The online documentation defines a special numeric missing value as:
a type of numeric missing value that enables you to represent different categories of missing data by using the letters A-Z or an underscore.

That means there are 28 possible numeric missing values in total:
._
.
.A-.Z

To avoid overlooking these values, we should use the MISSING and NMISS functions instead.
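
A minimal sketch of the difference, with made-up variable names:

```sas
data _null_;
x = .A;   /* special missing value */
y = ._;   /* special missing value */
z = 5;
/* The usual test fails: .A is missing but not equal to the ordinary missing value. */
eq_dot = (x = .);            /* 0 */
/* MISSING recognizes every kind of missing value. */
is_missing = missing(x);     /* 1 */
/* NMISS counts missing numeric arguments, special ones included. */
n_miss = nmiss(x, y, z);     /* 2 */
put eq_dot= is_missing= n_miss=;
run;
```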

For a fuller explanation, see the SUGI 31 paper: MISSING! - Understanding and Making the Most of Missing Data