Home › Forums › Main Forums › SAS Forum › How to “impute” missing values with previous values?
-
How to “impute” missing values with previous values?
-
Q: Given the below dataset, for the Name variable, we want to impute the missing values with previous value. By default, they are same as the previous names.
name state date q1 q2 q3 q4
Jason NY 04-15-2010 very poor not satisfied yes no
Mike NY 6/12/2010 very good satisfied no yes
9/21/2010 very good satisfied no yes
3/8/2011 very good satisfied no yes
Mary CA 5-26-2010 good satisfied no yes
8/12/2009 good satisfied no yes
6/12/2012 very good not satisfied yes no
7/30/2003 bad satisfied no no
3/12/2008 good satisfied yes yes
Peter CA 12/3/2010 poor not satisfied yes no
12/11/2003 good satisfied yes no
10/23/2007 bad not satisfied no no -
Method A: Retain + Merge
filename Alice 'file path\alice.xls';
proc import datafile=Alice out=alice(rename=(Name_ = Name)) dbms=Excel replace;
run;*************** Condition: By default, the blank in Name means it has same value as the previous obs. ************;
data B(drop=pre);
length Pre $20 New_Name $20 ;
retain Pre ' ';
set Alice;if Name =" " then New_Name=Pre;
else New_Name=Name;
Pre=New_Name;
run;data First_Name ;
set B;
if New_Name = Name;
run;proc sort data=First_Name; by name; run;
proc sort data=B(keep=New_Name Q5) out=Full ;
by New_Name;
run;data final(drop=New_Name);
length name $20;
merge First_Name full(rename=(new_name=name)) ;
by name;
run; -
Method B:
Use SUM statement to create a counter variable CNT, then split the dataset into 2 parts based on CNT=0 or not. Then merge them back together. Drop and rename vars if needed. See below:
data W;
length CNT 3 New $20 ;
set Alice;
retain New ' ' ;
if Name =' ' then CNT +1;
else CNT=0;
if CNT=0 then New=Name;
run;data Old(drop=CNT);
set W;
where CNT=0;
run;
proc sort data=Old; by New; run;
proc sort data=W(keep= New Name Q5) out= New;
by New;
run;
data final;
merge old new;
by New;
run;Method 3: Use By—NotSorted option.
data B;
length New $10 Pre $10;
set Alice;
by Name notsorted;
retain Pre 'AAA';
F=first.Name;
L=last.Name;
Flag=F+L;
if flag=2 and not missing(Name) then New=Name;
else New=Pre;
Pre=New;
run;data Old;
set B;
if F=1 and not missing(Name) then output Old;
run;Then merge the OLD with B by New.
Reach-out:
Some people may not have SAS in their company. If so, can you only use SQL to do it? It will be much more challenging in coding. But no pain, no gain!
- This reply was modified 3 years, 9 months ago by Justin.
Log in to reply.