Tuesday, December 9, 2008

How to sort dataset with format?

Every SAS programmer must be familiar with Proc SORT. Every time we merge multiple datasets, do group-by processing, etc. we have to sort the dataset by key variables.

In many procs, Format statement can function partly as group-by processing, e.g. Proc MEANS, Proc REPORT etc.

However, different from the procs above, Format statement dose not do group-by processing in Proc SORT.
The Format statement in Proc SORT ONLY assign the format to the variable in output dataset.

Generally, we create a new variable based on the format and then sort the new variable using Proc SORT.
With Proc SQL, we can do the same operations in one step. :)

Sample code:

proc format;
value $sex
'M' = '1'
'F' = '2';
run;

* Format statement does not work in Proc SORT;
proc sort data=sashelp.class out=sortfail;
format sex $sex.;
by sex;
run;

* With Proc SQL, we can sort the dataset with format
in one step;
proc sql;
create table sortok as
select * from sashelp.class
order by put(sex, $sex.);
quit;

* More steps with "by vars NOTSORTED";

No comments: