Software Training Institute in Chennai with 100% Placements – SLA Institute

Unlock Data Insights: SAS Programming Tutorial for Beginners

Published On: April 30, 2025

SAS software is widely used for data analysis, business intelligence, and statistical reporting across many industries, so SAS programmers remain in high demand. Both junior and senior SAS programmers are expected to stay in demand worldwide, with opportunities for remote and part-time work. Use this SAS programming tutorial to learn the fundamental concepts.

Introduction to SAS Programming

Originally an abbreviation of Statistical Analysis System, SAS has grown into a comprehensive software suite used for a variety of tasks, such as:

  • Data Management: Organize, clean, and manipulate data from a variety of sources.
  • Business Intelligence: Convert raw data into insights that support well-informed decisions.
  • Advanced Analytics: Perform data mining, predictive modeling, and complex statistical analyses.
  • Reporting and Visualization: Produce engaging reports and visualizations that convey findings clearly.

Think of SAS as a powerful tool that helps you unlock the potential in your data.

Core Components of SAS

A SAS program is typically built from two fundamental blocks:

DATA Step: This step is your data wrangler. Here, you can either enter data directly into your program or read it from other sources (such as databases, spreadsheets, or text files).

You can also work with this data by combining datasets, filtering observations, calculating new variables, and much more.

PROC Step (Procedures): Procedures are pre-written programs that operate on your SAS datasets to accomplish particular tasks.

Whether you want to compute descriptive statistics (such as mean, median, and standard deviation), produce tables, make charts, run statistical models (such as regression or ANOVA), or export your results, there is probably a procedure for it.

SAS Environment

There are a few main windows that you will usually deal with when working with SAS:

  • Editor Window: This is where you write and modify your SAS programs, which consist of DATA and PROC steps.
  • Log Window: This window works like a journal for your program. It shows information about how your program ran, including any errors, warnings, and notes. It is vital for troubleshooting and for understanding what SAS is doing.
  • Output Window: Also known as the Results Viewer, this is where the output of your PROC steps, such as tables, reports, and other text-based output, appears.
  • Explorer Window: This window helps you manage your SAS files and browse the “libraries” you have assigned. Libraries are similar to folders: they hold your SAS datasets and other SAS files.

Basic SAS Syntax

To get you started, consider these basic syntax rules:

  • Every statement ends with a semicolon (;). It works like the period at the end of a sentence.
  • SAS is generally not case-sensitive: DATA MyData; and data mydata; are interchangeable. Case does matter, though, for text enclosed in quotation marks.
  • For readability, keywords are often written in capitals, for instance DATA, PROC, INPUT, and RUN.
  • Comments either begin with an asterisk (*) and end with a semicolon (;), or are enclosed in /* … */. SAS ignores them when it runs your code.

Example:

DATA students;
  INPUT name $ age;
  DATALINES;
John 18
Jane 19
Peter 20
;
RUN;

PROC PRINT DATA=students;
RUN;

A DATA step creates the dataset by reading the data, and a PROC step processes and displays it. This simple program illustrates the basic structure of a SAS program.
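The syntax rules listed earlier can be seen together in one short sketch. It uses both comment styles, mixes keyword case, and shows that text inside quotation marks keeps its case. The dataset name demo and its values are invented for illustration.

```sas
* A statement-style comment ends at the semicolon;
/* A block-style comment can span
   several lines */

data demo;                 /* lowercase keywords work too */
  INPUT name $ age;
  DATALINES;
John 18
john 19
;
RUN;

/* Keywords and dataset names are case-insensitive, so DEMO matches demo.
   The quoted value is case-sensitive: only the row with "John" is printed. */
PROC PRINT DATA=DEMO;
  WHERE name = 'John';
RUN;
```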

Working with SAS Libraries

SAS libraries are essential for organizing and retrieving your SAS datasets and other SAS files. Think of a SAS library as a directory or folder that SAS can identify and use to store and access your work.

A SAS library is essentially a named collection of SAS files. These files may include:

  • SAS Datasets: Structured tables with rows (observations) and columns (variables), which are the main way SAS stores data.
  • Catalogs: Containers for various SAS objects, including formats, informats, graphics, and compiled macros.
  • Other SAS Files: For example, views and stored compiled DATA step programs.

Types of SAS Libraries

You’ll mostly come across two kinds of SAS libraries:

Temporary Libraries (WORK): 

At the start of every SAS session, SAS automatically generates this default library.

  • It serves as a scratch storage area. Any files or datasets you create in the WORK library are deleted automatically when your SAS session ends.
  • SAS already knows about the WORK library, so you don’t need to define it yourself.
  • It’s useful for intermediate datasets or files that you need only for the current session.
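A small sketch of the WORK library in use: a one-level dataset name and the explicit WORK. prefix refer to the same temporary dataset. The name temp_totals is hypothetical, and the sketch assumes the session uses the default setup in which one-level names go to WORK.

```sas
/* A one-level name is stored in WORK by default */
DATA temp_totals;
  total = 10 + 20;
RUN;

/* WORK.temp_totals is the same dataset, named explicitly */
PROC PRINT DATA=WORK.temp_totals;
RUN;
```

Once the SAS session ends, temp_totals is gone; rerunning the DATA step in a new session recreates it.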

Permanent Libraries: 

These are libraries that you define yourself and associate with particular physical locations on your computer or network.

  • Files and data kept in permanent libraries remain available after you end your SAS session.
  • A LIBNAME statement is required to tell SAS where these libraries are located.
  • Permanent libraries are essential for files and data that you need to access across several SAS sessions or projects.

The LIBNAME Statement: Connecting to Your Data

To use a permanent SAS library, you must associate a libref, a short logical name that you use within your SAS code, with the physical path, the actual location on your file system where your SAS files are or will be kept. The LIBNAME statement does this.

The basic syntax of the LIBNAME statement is:

LIBNAME libref 'physical-path';

  • LIBNAME: The SAS keyword that tells SAS you are defining a library.
  • libref: The name you choose to represent your library in your SAS program. It can be up to eight characters long, must begin with a letter or underscore, and may contain only letters, digits, and underscores. It works like a nickname for the physical location.
  • 'physical-path': The full path to the directory that contains your SAS files or where you want to store them. You must enclose the path in single or double quotation marks.

Example:

Let’s say that you want to keep your SAS datasets in a folder called “SAS_Data” on your C drive. A SAS library called “MyData” that points to this location could be defined like this:

LIBNAME MyData 'C:\SAS_Data';

LIBNAME is a global statement that takes effect as soon as it is submitted, so it does not need a RUN statement. Once it has been executed, you can use the libref “MyData” followed by a dot and the dataset name (e.g., MyData.customer_info) to access the datasets stored in the “C:\SAS_Data” folder.
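As a sketch of the full life cycle of a libref, the statements below assign the library, ask SAS to write its attributes to the log, and then release it. The path is the tutorial's example folder and is assumed to exist on your machine.

```sas
/* Assign the libref (the folder must already exist) */
LIBNAME MyData 'C:\SAS_Data';

/* Write the library's attributes to the log */
LIBNAME MyData LIST;

/* Release the libref when it is no longer needed; the files stay on disk */
LIBNAME MyData CLEAR;
```

Checking the log after the first statement is a quick way to confirm that the libref was assigned before you rely on it in later steps.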

Accessing Data in SAS Libraries

After defining a library, you can use your DATA and PROC steps to access the SAS datasets that are kept there.

Referencing Datasets: The two-level naming convention libref.dataset_name is used to refer to a dataset in a library. 

Example: A dataset named “sales” in the “MyData” library is referenced as MyData.sales.

Creating Datasets in a Library: The libref in the DATA statement allows you to select the library where you wish to keep a new SAS dataset that you create in a DATA step:

DATA MyData.new_customers;
  /* :$12. reads up to 12 characters; a plain $ would truncate Bangalore to 8 */
  INPUT id name $ city :$12.;
  DATALINES;
1 Alice Chennai
2 Bob Bangalore
;
RUN;

As a result, a new SAS dataset called “new_customers” will be created and stored in the physical location linked to the “MyData” libref (“C:\SAS_Data” in our example). SAS will store the dataset in the temporary WORK library by default if you don’t specify a libref.

Listing the Contents of a SAS Library

To view the SAS files kept in a specific library, use the CONTENTS procedure:

PROC CONTENTS DATA=_ALL_ NODS;
RUN;

PROC CONTENTS DATA=MyData._ALL_ NODS;
RUN;

  • DATA=_ALL_: With no libref, this lists the contents of the default library, which is normally WORK.
  • DATA=MyData._ALL_: This lists the contents of the “MyData” library specifically.
  • NODS: This option prints only the directory of library members (datasets, catalogs, etc.) and suppresses the variable-level details of each dataset.

Working efficiently with SAS libraries is key to managing your SAS projects and making sure that your data and analysis results are saved and available when needed.

By using permanent libraries, you can collaborate more effectively and preserve your important data across SAS sessions.

The DATA Step: Creating and Manipulating Data in SAS

The DATA step generates one or more SAS datasets as output after reading input data and processing it in accordance with your instructions.

Basic Structure of a DATA Step

A DATA step typically follows this structure:

DATA output_dataset(s);
  [informat(s);]
  [input_specification(s);]
  [data_manipulation_statements;]
RUN;

DATA output_dataset(s);: This statement starts the DATA step and names the SAS dataset(s) you want to create.

[informat(s);]: Informats tell SAS how to read external data into SAS variables. They define the layout and data type of the incoming values, and are used most often when reading from external files.

[input_specification(s);]: The INPUT statement describes the structure of your input data. It names the variables you want in your SAS dataset and indicates their order and, where needed, their data type.

[data_manipulation_statements;]: The magic truly begins here! SAS statements are written here to:

  • Assign values to variables.
  • Perform calculations.
  • Apply conditional logic (IF-THEN-ELSE).
  • Select or exclude observations (WHERE, IF).
  • Create new variables.
  • Modify existing variables.
  • Combine or split datasets.

RUN;: This statement marks the end of the DATA step and tells SAS to execute the instructions you have supplied.
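The kinds of data manipulation listed above can be sketched in a single DATA step. The dataset orders_clean, its variables, and the threshold of 100 are all made up for illustration.

```sas
DATA orders_clean;
  /* Declare the character variable's length before it is first used */
  LENGTH size $ 5;

  INPUT order_id quantity unit_price;

  /* Create a new variable from a calculation */
  total = quantity * unit_price;

  /* Conditional logic with IF-THEN/ELSE */
  IF total >= 100 THEN size = 'Large';
  ELSE size = 'Small';

  /* Subsetting IF: keep only rows with a positive quantity */
  IF quantity > 0;

  DATALINES;
1 2 75
2 0 40
3 1 20
;
RUN;
```

The second input row has a quantity of 0, so the subsetting IF drops it and orders_clean ends up with two observations.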

Reading Data into the DATA Step

The DATA step can read data from a variety of sources. Here are a few common methods:

Reading from External Files (INFILE statement): The INFILE statement is used to describe the location of external files containing your data, such as CSV, TXT, or other delimited files. 

DATA employees;
  INFILE 'C:\Data\employee_data.csv' DELIMITER=',';
  INPUT employee_id name $ department $ salary;
RUN;

  • INFILE 'C:\Data\employee_data.csv': Tells SAS to read data from the specified file.
  • DELIMITER=',': Indicates that the values in the file are separated by commas.
  • INPUT employee_id name $ department $ salary;: Names the variables and gives their order. The $ symbol marks name and department as character variables.
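Real CSV files often have a header row, quoted values, or text longer than eight characters, which the simple example above does not handle. The sketch below is one way to cover those cases. It assumes the file has a header line and uses hypothetical column lengths, so adjust both to match your data.

```sas
DATA employees;
  /* DSD uses a comma as the delimiter, treats consecutive commas as
     missing values, and strips quotation marks from quoted values.
     FIRSTOBS=2 skips an assumed header row. */
  INFILE 'C:\Data\employee_data.csv' DSD FIRSTOBS=2;

  /* Assumed lengths: list input would otherwise default to 8 characters */
  LENGTH name $ 40 department $ 30;

  INPUT employee_id name $ department $ salary;
RUN;
```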

Reading Data Inline (DATALINES or CARDS statement): The DATALINES statement (or its older synonym CARDS) lets you include small amounts of data directly in your SAS program.

DATA products;
  INPUT product_id name $ price;
  DATALINES;
101 Laptop 1200
102 Mouse 25
103 Keyboard 75
;
RUN;

Reading from Existing SAS Datasets (SET statement): The SET statement allows you to read data from one or more SAS datasets that are already in existence. Existing SAS datasets are frequently combined or further processed using this method.

DATA combined_sales;
  SET sales_q1 sales_q2;
RUN;

This example creates a new dataset called combined_sales by reading observations from the sales_q1 and sales_q2 datasets.
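To illustrate further processing while combining, the sketch below stacks two datasets, records which one each observation came from, and filters the result. The IN= dataset option creates a temporary 0/1 flag for its dataset. The two input datasets are tiny made-up stand-ins for sales_q1 and sales_q2.

```sas
DATA sales_q1;
  INPUT region $ amount;
  DATALINES;
North 100
South 250
;
RUN;

DATA sales_q2;
  INPUT region $ amount;
  DATALINES;
North 300
South 50
;
RUN;

DATA combined_sales;
  /* from_q1 is 1 when the current observation was read from sales_q1 */
  SET sales_q1(IN=from_q1) sales_q2;

  /* IN= flags are not written to the output, so save the quarter */
  IF from_q1 THEN quarter = 1;
  ELSE quarter = 2;

  /* Keep only sales of 100 or more */
  IF amount >= 100;
RUN;
```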

The PROC Step: Analyzing and Reporting Data

PROC steps are pre-written procedures that carry out particular operations on your SAS datasets. These operations include computing descriptive statistics, producing a variety of reports and visualizations, and fitting complex statistical models.

Basic Structure of a PROC Step

This is the structure of a typical PROC step:

PROC procedure_name [options];
  [statements;]
RUN;

PROC procedure_name: This statement starts the PROC step and names the procedure you want to use (e.g., PRINT, SORT, MEANS, FREQ, REG, SGPLOT).

[options]: Many procedures accept options that change their behavior or output. Options are usually listed after the procedure name, separated by spaces. The most common one is DATA=dataset_name, which specifies the SAS dataset the procedure will use.

[statements;]: A PROC step may contain a number of statements that give the procedure additional instructions. Common statements include:

  • VAR variable(s);: Specifies the variables to be analyzed.
  • BY variable(s);: Processes the data in groups defined by the specified variables; the data generally must be sorted by those variables first.
  • CLASS variable(s);: Specifies categorical variables used for classification or grouping.
  • TABLES variable(s);: Specifies the tables to be generated in PROC FREQ (PROC TABULATE uses the TABLE statement instead).
  • OUTPUT: Stores results in a new SAS dataset in a variety of analytical procedures.
  • Many other statements and options specific to each procedure.

RUN;: This statement marks the end of the PROC step and tells SAS to execute the procedure with the given options and statements.
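The pieces of a PROC step can be sketched with a small BY-group summary. The dataset sales, its values, and the output names category_stats and avg_price are hypothetical.

```sas
DATA sales;
  INPUT category $ price;
  DATALINES;
Toys 10
Books 20
Toys 30
Books 40
;
RUN;

/* BY-group processing expects the data to be sorted by the BY variable */
PROC SORT DATA=sales;
  BY category;
RUN;

/* DATA= and NOPRINT are options on the PROC statement;
   VAR, BY, and OUTPUT are statements inside the step */
PROC MEANS DATA=sales NOPRINT;
  VAR price;
  BY category;
  OUTPUT OUT=category_stats MEAN=avg_price;
RUN;

/* Show the summary dataset: one row per category */
PROC PRINT DATA=category_stats;
RUN;
```

Writing the statistics to a dataset with OUTPUT, rather than only printing them, lets a later DATA or PROC step reuse the results.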

Common and Useful PROC Procedures

SAS provides a large collection of PROC procedures. These are a few of the most often used ones, arranged according to their main purpose:

Data Listing and Basic Reporting:

PROC PRINT: Displays the observations in a SAS dataset. Useful for taking a quick look at your data.

PROC PRINT DATA=MyData.customers;
  VAR name age city; /* Specify variables to print */
  TITLE 'Customer Information'; /* Add a title to the output */
RUN;

PROC CONTENTS: Describes the structure of a SAS dataset or library, including the names, types, formats, and labels of variables.

PROC CONTENTS DATA=MyData.products; /* Get dataset information */
RUN;

PROC CONTENTS DATA=MyData._ALL_ NODS; /* Get library information */
RUN;

Sorting and Ordering Data:

PROC SORT: Arranges the observations of a SAS dataset according to the values of one or more variables. The sorted result can overwrite the original dataset or, with the OUT= option, be written to a new one.

PROC SORT DATA=MyData.orders OUT=SortedOrders;
  BY customer_id order_date; /* Sort by customer ID then by order date */
RUN;

Descriptive Statistics:

PROC MEANS: For numerical variables, it calculates descriptive statistics like mean, standard deviation, minimum, and maximum.

PROC MEANS DATA=MyData.sales;
  VAR price quantity total_amount;
  CLASS product_category; /* Calculate statistics for each product category */
RUN;

PROC FREQ: For categorical variables, PROC FREQ generates frequency tables and computes percentages. Cross-tabulations of several categorical variables can also be generated by it.

PROC FREQ DATA=MyData.demographics;
  TABLES gender education_level gender*education_level; /* Create frequency tables */
RUN;

PROC UNIVARIATE: It offers more detailed descriptive statistics, such as quantiles, tests for normality, and measures of shape (skewness, kurtosis).

PROC UNIVARIATE DATA=MyData.salaries;
  VAR salary;
  HISTOGRAM / NORMAL; /* Create a histogram with a normal curve overlay */
RUN;

Tabular Reporting:

PROC TABULATE: It produces highly flexible tables with a range of statistics computed for combinations of categorical variables.

PROC TABULATE DATA=MyData.survey;
  CLASS region product;
  VAR satisfaction_score;
  TABLE region, product*mean='Average'*satisfaction_score;
RUN;

Statistical Analysis:

SAS provides numerous statistical analysis procedures, such as:

  • Regression Analysis (PROC REG, PROC LOGISTIC, PROC GLM): Models the relationship between dependent and independent variables.
  • Analysis of Variance (ANOVA) (PROC ANOVA, PROC GLM): Compares means among several groups.
  • Time Series Analysis (PROC ARIMA, PROC FORECAST): Analyzes and forecasts time-dependent data.
  • Multivariate Analysis (PROC FACTOR, PROC CLUSTER, PROC DISCRIM): Investigates complex relationships in data containing many variables.
  • Survival Analysis (PROC LIFETEST, PROC PHREG): Analyzes time-to-event data.
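As one small illustration of the procedures listed above, the sketch below fits a simple linear regression with PROC REG. The dataset study and its values are invented, and the example assumes SAS/STAT is available in your installation.

```sas
DATA study;
  INPUT hours score;
  DATALINES;
1 52
2 58
3 65
4 70
5 79
;
RUN;

/* Model score as a linear function of hours */
PROC REG DATA=study;
  MODEL score = hours;
RUN;
QUIT;  /* PROC REG is interactive, so QUIT ends the procedure */
```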

Graphics and Visualization:

PROC SGPLOT (Statistical Graphics Procedure): A powerful and flexible procedure for producing a wide range of statistical graphics, including box plots, histograms, bar charts, scatter plots, and more. Here is a sample:

PROC SGPLOT DATA=MyData.temperatures;
  SCATTER x=month y=average_temp;
  SERIES x=month y=trend_line;
  TITLE 'Average Monthly Temperatures';
RUN;
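PROC SGPLOT can draw other plot types by swapping the plot statement. The sketch below draws one box plot per group with VBOX. The dataset city_temps and its readings are made up for illustration.

```sas
DATA city_temps;
  INPUT city $ temp;
  DATALINES;
Chennai 31
Chennai 33
Chennai 35
Ooty 14
Ooty 17
Ooty 19
;
RUN;

/* One box per city, showing the spread of temperature readings */
PROC SGPLOT DATA=city_temps;
  VBOX temp / CATEGORY=city;
  TITLE 'Temperature Readings by City';
RUN;

TITLE;  /* clear the title so it does not carry over to later output */
```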

Older Graphics Procedures (PROC GCHART, PROC GPLOT, PROC GMAP): These are still available, but PROC SGPLOT is generally preferred for modern graphics.

Data Manipulation and Utility Procedures:

  • PROC SORT: In addition to sorting, PROC SORT can eliminate duplicates by using the NODUPKEY option.
  • PROC TRANSPOSE: Restructures a SAS dataset by switching rows and columns.
  • PROC COPY: Copies SAS files and datasets between libraries.
  • PROC DELETE: Deletes SAS files or datasets permanently.
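Two of these utilities are often used together. The sketch below removes duplicate rows with NODUPKEY and then reshapes the result from long to wide with PROC TRANSPOSE. The dataset visits and its values are hypothetical.

```sas
DATA visits;
  INPUT customer_id month $ amount;
  DATALINES;
1 Jan 10
1 Jan 10
1 Feb 20
2 Jan 30
2 Feb 40
;
RUN;

/* NODUPKEY keeps the first observation for each combination of BY values */
PROC SORT DATA=visits OUT=visits_unique NODUPKEY;
  BY customer_id month;
RUN;

/* One row per customer, one column per month */
PROC TRANSPOSE DATA=visits_unique OUT=visits_wide(DROP=_NAME_);
  BY customer_id;
  ID month;
  VAR amount;
RUN;
```

Because OUT= is used in the sort, the original visits dataset is left unchanged and the de-duplicated copy feeds the transpose.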

How PROC Steps Work with DATA Steps

It is important to understand that PROC steps usually operate on SAS datasets that were either created by a DATA step or already exist. The flow of a SAS program typically includes:

  • DATA step(s): Read, clean, and prepare your data to create one or more SAS datasets.
  • PROC step(s): Analyze, summarize, report on, or visualize the data in those SAS datasets.

A single SAS program can contain several DATA steps and PROC steps, which lets you carry out a series of data processing and analysis operations.

The PROC step gives you access to SAS’s analytical engine and reporting features, so you can derive valuable insights from your data and communicate them effectively.

Conclusion

This SAS Programming Tutorial for Beginners covers the key concepts you need to begin programming in SAS. You can explore each of these topics in depth, with explanations, examples, and hands-on exercises, through our SAS training in Chennai.
