Hello,
I have 3 large datasets with over 800 variables that I need to merge. The problem is that some of the datasets share variable names, so I need to rename all the variables in each dataset before merging. The renaming would be systematic: every variable in dataset1 should get an 'a' in front of its current name, every variable in dataset2 a 'b', and every variable in dataset3 a 'c'. I ran the following code:
Proc contents data=work.dataset1 out=varnames;
data _null_;
set varnames end=eof;
if _n_=1 then put "rename";
newvarname = 'a' || trim(name);
put name '= ' newvarname;
if eof then put ';';
run;
I then copied and pasted the renamed list from the output into a new data step (note I only pasted a few variables here):
data dataset1renamed (rename=
( External_Identifiers = birthExternal_Identifiers)
( GA_at_birth = birthGA_at_birth)
( Health_card_number = birthHealth_card_number)
( apgar01 = birthapgar01)
(apgar05 = birthapgar05)
(apgar10 = birthapgar10)
(arterial_cord_blood_base_excess_ = birtharterial_cord_blood));
set dataset1;
output;
run;
However, when I try to run this code, I get the following error messages for each variable that I tried to rename:
ERROR 22-7: Invalid option name GA_AT_BIRTH.
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, /, ;, _DATA_, _LAST_, _NULL_.
ERROR 76-322: Syntax error, statement will be ignored.
Does anyone have any suggestions, either for fixing my code or for a better way to rename all the variables in a dataset?
Thanks so much for your help!
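One possible direction, offered as a minimal sketch rather than a definitive fix: the RENAME= data set option takes a single parenthesized list of old=new pairs, so wrapping each pair in its own parentheses is the likely cause of the errors above. The sketch below builds that list from dictionary.columns instead of copying it from the log. The macro variable name renamelist is hypothetical, and the code assumes the table is WORK.DATASET1, that every name is short enough to take the prefix without exceeding the 32-character limit, and that the full list fits in one macro variable.

```sas
/* Build one "old=new" pair per variable and collect them in a macro variable. */
/* Assumption: renamelist is an illustrative name, not from the original post. */
proc sql noprint;
    select catx('=', name, cats('a', name))
        into :renamelist separated by ' '
        from dictionary.columns
        where libname = 'WORK'          /* libname and memname are stored in uppercase */
          and memname = 'DATASET1';
quit;

/* Apply the list inside a single pair of parentheses after RENAME=. */
data dataset1renamed;
    set dataset1 (rename=(&renamelist));
run;
```

The same steps could be repeated for dataset2 and dataset3 with 'b' and 'c' as the prefix. A 32-character name such as arterial_cord_blood_base_excess_ would need to be shortened by hand before a prefix is added.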