Monday, June 6, 2022

Summary aggregate using Hash

Below are the key statements from the "Maintaining Key Summaries" section of the SAS documentation that explain how hash key summaries work. It is a powerful technique once you understand how it works. However, avoid combining SUMINC with MULTIDATA or DO_OVER: they change the summary value, and I cannot find any explanation of this in the SAS documentation.

quote:
The summary value of a hash key is initialized to the value of the SUMINC variable whenever the ADD or REPLACE method is used.
The summary value of a hash key is incremented by the value of the SUMINC variable whenever the FIND, CHECK, or REF method is used.
Note that the SUMINC variable can be negative, positive, or zero valued. The variable does not need to be an integer. The SUMINC value for a key is zero by default.
unquote:
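
The rules quoted above can be seen in a minimal sketch with a single key. The names k, inc, total and rc are made up for this illustration and are not part of the example that follows. Going by the quoted rules, ADD should set the summary to 5, FIND should add 2.5 and CHECK should add -1, so total should come out as 5 + 2.5 - 1 = 6.5.

```sas
data _null_;
    length k inc total 8;

    /* inc is the SUMINC variable: its current value is used by each call below */
    declare hash h(suminc: "inc");
    h.defineKey("k");
    h.defineDone();

    k = 1;

    inc = 5;
    rc = h.add();            /* ADD: summary for k=1 is initialized to inc */

    inc = 2.5;
    rc = h.find();           /* FIND: summary is incremented by inc (non-integer is fine) */

    inc = -1;
    rc = h.check();          /* CHECK: summary is incremented by inc (negative is fine) */

    rc = h.sum(sum: total);  /* SUM: copy the summary for the current key into total */
    put total=;
run;
```

Note that it is the value of inc at the moment of each call that matters, which is why the variable is reassigned before every method call.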
data sample;
    do key = 1,2,3;
        value = key*2;
        output; output;
    end;
run;

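data _null_ ;
    if _n_ = 1 then do ;
        /* SUMINC names the variable whose value is added to a key's summary */
        declare hash h (suminc:"value", dataset:"sample") ;
        h.defineKey("key") ;
        h.defineDone();

        /* First pass: each CHECK increments the summary of the matching key by VALUE */
        do until (eof) ;
            set sample end=eof ;
            h.check();
        end;
    end ;

    file print ;

    /* Second pass: print one total per key */
    set sample;
    by key;

    if first.key then do ;
        h.sum(sum:Total);   /* retrieve the summary of the current key into Total */
        put key @10 Total;
    end;
run;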
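
To sanity-check the totals printed by the hash step, it can help to compare them with a plain aggregate over the same sample data set. This is only a cross-check using PROC SQL, not part of the hash technique itself. For the sample data each key has two rows with value = key*2, so the query returns 4, 8 and 12 for keys 1, 2 and 3.

```sas
proc sql;
    /* Reference totals: straightforward sum of value per key */
    select key, sum(value) as Total
    from sample
    group by key;
quit;
```

If the hash totals differ from these reference values, the difference shows how much the summary was affected by initialization or by extra method calls, which is the kind of effect mentioned above for MULTIDATA/DO_OVER.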
