Format, First Varible, Last Varible, Proc compare, Retain, Proc Transpose, Proc Append

SAS Formats

SAS Formats decide how to display values of a variable. They define the appearance of variables when they are printed or exported. For example, you can use a format to display a numeric variable as a currency, percentage. Formats do not change the underlying data values; they only affect their presentation.

Below is a list of common SAS Formats.

Data TypeFormatsExplanation
Character$w.Displays character values of length w.
Numericw.dDisplays numeric values of length w with d decimal points
DateDATEw.Displays SAS date values of length w.
TimeTIMEw.dDisplays time values in the form hh:mm:ss.ss

Character Formats

  • $w.: Displays character values to a specified width.
  • $UPCASEw.: Displays character values in uppercase and optionally truncates them to a specified width.

The FORMAT statement is used to format values in a specific format. Here we are using the $UPCASEw. format.

Numeric Formats

DOLLARw.d: Displays numeric values as currency with a dollar sign, commas, and a specified number of decimal places.

COMMAw.d: Displays numeric values with commas and a specified number of decimal places.

PERCENTw.d: Displays numeric values as percentages with a specified number of decimal places.

The “dollar8.” format specifies that the values of the “msrp” variable should be displayed as currency with a dollar sign and commas, with a width of 8 characters. The “msrp” variable represents the manufacturer’s suggested retail price of the cars.

data mydata;
set sashelp.cars; 
format msrp dollar8.;

run;

Date Formats:

The line format date date11.; specifies that the “date” variable in the “mydata” dataset should be formatted using the “date11.” format. This format represents the date values in the format “dd-MON-yyyy“. The resulting dates will have a width of 11 characters.

data mydata;

set sashelp.pricedata;

format date date11.;

run;

Example 1 : Suppose you are asked to group MSRP variable based on the following conditions and check the number of observations falling in each groups.

  1. Values greater than 40,000 should be labeled as ‘High’
  2. Values between 26,000 and 40,000 should be labeled as ‘Medium’
  3. Otherwise, label them as ‘Low’
proc format;
value range
40000-high='High'
26000-< 40000='Medium'
other ='Low';
run;

proc freq data = sashelp.cars;

table msrp;
format msrp range.;

run;proc freq data = sashelp.cars;

table msrp;
format msrp range.;

run;

Introduction : PROC COMPARE

PROC COMPARE in SAS is used to compare the contents and structure of two datasets. It returns a summary of both the similarities and differences found between two datasets.

Let’s compare two built-in SAS datasets: sashelp.class and sashelp.classfit.

proc compare

 base = sashelp.class

 compare = sashelp.classfit;

run;

Uses of RETAIN Statement

The RETAIN statement simply copies retaining values by telling the SAS not to reset the variables to missing at the beginning of each iteration of the DATA step. If you would not use retain statement then SAS would return missing at the beginning of each iteration.

data abcd;
input x y;
cards;
1 25
1 28
1 27
2 23
2 35
2 34
3 25
3 29
;

run;

Suppose you need to generate a serial number (or row index number) with data step.

data aaa;

set abcd;

retain z 0;

z = z + 1;

run;

What does PROC TRANSPOSE do?

PROC TRANSPOSE is useful when you want to reshape your data in SAS. For example, if your data is in a vertical format but you want to convert it into a wide/horizontal format, PROC TRANSPOSE can do this task easily.

Sample Dataset

Let’s create sample data which is used for explaining the TRANSPOSE procedure. Suppose you have data for students with their marks in respective subjects. In the dataset below, you have three variables ‘Name’, ‘Subject’ and ‘Marks’.

data transp;

input Name $ Subject $ Marks;

cards;

Samma Maths 96

Sandy English 76

Devesh German 76

Rakesh Maths 50

Priya English 62

Kranti Maths 92

William German 87

;

run;

Simplest Form of PROC TRANSPOSE

proc transpose data = transp out= outdata;

run;

The above code creates a dataset called outdata which contains values of variable ‘Marks’ stored in horizontal (wide) format. In other words, it transposes only variable “Marks” which is numeric. It is because by default, PROC TRANSPOSE transposes all numeric variables in the dataset.

Options in PROC TRANSPOSE

  1. The NAME= option allows you to change the name of the _NAME_ variable. It is the name of the variable that is transposed.
  2. The PREFIX= option allows you to change the prefix “COL“. It is prefix to the transposed values.

proc transpose data = transp name=VarName prefix=Student out= outdata;

run;

PROC APPEND

The basic syntax of PROC APPEND is as follows :

PROC APPEND BASE=output-dataset DATA=source-dataset;
RUN;

BASE= : Specify the name of the dataset where the source dataset will be added. DATA= Specify the name of the dataset that needs to be added.

data mydata1;
  input Product $ Sales Profit;
  datalines;
A 110 23
B 147 31
C 238 51
D 207 42
;
run;
 
data mydata2;
  input Product $ Sales Profit;
  datalines;
E 81 13
F 87 17
G 69 10
;
run;

PROC APPEND BASE=mydata1 DATA=mydata2;

RUN;

Use the FORCE option within PROC APPEND to tell SAS to avoid mismatch of variable length while appending datasets.

PROC APPEND BASE=mydata3 DATA=mydata4 force;

RUN;

When you have datasets that contain different variable names, FORCE option within PROC APPEND would let them combined but returns missing values for those variables.

data team1;

  input Names $ Scores Matches;

  datalines;

A 110 23

B 147 31

C 238 51

D 207 42

;

run;

data team2;

  input Players $ Goals Matches;

  datalines;

E 81 13

F 87 17

G 69 10

;

run;

PROC APPEND BASE=team1 DATA=team2 force;

RUN;

proc print;

Run;

HERE, Base Table set is team1.