Example: Reading_Multiple_files.sas

This SAS program is for reading multiple files to SAS. Suppose you receive a large amount of files from your client, all the files has the same layout. You can write an INPUT statement to read one of the files to SAS and use the following Macro to automatically load all files to SAS and combine all data set to a single SAS data set. A basic knowledge of SAS Macro language is required for this example.

Suppose the file you received 9 files (or a few thousands files) from your client. The file names are: dumfile1.dat, dumfile2.dat, dumfile3.dat, dumfile4.dat, dumfile5.dat, dumfile6.dat, dumfile7.dat, dumfile8.dat, dumfile8.dat. Each has the following layout.

 

A     B       C          D             E

2   2    1     0     869

1   2    2     0      98

2   2    2     0      88

1   2    3     0      15

2   2    3     0      18

1   2    4     0       9

2   2    4     0       7

1   2    5     0       6

2   2    5     0       1

1   2    6     0      30

2   2    6     0      26

(more data lines)

 

Your SAS program to read these files could be as simple as the follows:

 

data dumfile1;                                                                                                                                                                                                      

  infile "c:\temp\dumfile1.dat; 

  input a b c d e;

run;

 

data dumfile2;                                                                                                                                                                                                      

  infile "c:\temp\dumfile2.dat; 

  input a b c d e;

run;

.

.

.

data dumfile9;                                                                                                                                                                                                      

  infile "c:\temp\dumfile9.dat; 

  input a b c d e;

run;

 

You combine these files by the following SAS code:

 

data combined;

  set dumfile1

      dumfile2

      dumfile3

      dumfile4

      dumfile5

      dumfile6

      dumfile7

      dumfile8

      dumfile9

  ;

run;

 

The SAS program in this example does exactly the above. When you need to read a large number of files, you will find it is very efficient to use the following code.

 

 

SAS Program: reading_multiple_files.sas

 

 

/********************************************************************************************

 * program:    reading_multiple_files.sas                                                   *

 * programmer: Tugluke Abdurazak                                                            *

 * purpose:    Macro to read multiple files to SAS.                                         *

 ********************************************************************************************/

 

options ls = 145 ps = 50 mprint pageno=1;

 

title 'Reading multiple data files to SAS';

footnote 'This is an Example';

 

libname out  'c:\temp';

 

*--- List of files to be read;

%let flist = dumfile1 dumfile2 dumfile3 dumfile4 dumfile5 dumfile6 dumfile7 dumfile8 dumfile9;

 

*---Raw file location;

%let fpath = c:\temp;

 

*--- Macro loopit loop through the list of file names and read them in;

 

%macro loopit( varlist =, fn=, firstobs=, lrecl= );  

 

  %local i ;                                                                                                                                                                                                             

  %let i = 1;                                                                                                                                                                                                           

  %do %until (%scan(&varlist,&i,%str( )) = %str());                                                                                                                                                                     

    %let var&i = %scan(&varlist,&i,%str( ));                                                                                                                                                                             

      data &&var&i;                                                                                                                                                                                                      

        length source $10. ;

        infile "&fpath\&&var&i...dat"  firstobs=&firstobs lrecl=&lrecl pad; 

 

            *---Input statement: May need to change;

        input a b c d e;

            source = "&&var&i" ;

        run;

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      

      *---Combine all data sets;

      %if &i = 1 %then                                                                                                                                                                                                   

        %do;                                                                                                                                                                                                             

           data out.&fn ;                                                                                                                                                                                          

             set &&var&i;                                                                                                                                                                                           

           run;                                                                                                                                                                                                     

        %end;                                                                                                                                                                                                       

        %else                                                                                                                                                                                                        

          %do;                                                                                                                                                                                                       

            proc append base=out.&fn data=&&var&i;                                                                                                                                                               

            run;                                                                                                                                                                                                  

          %end;                                                                                                                                                                                                     

                                                                                                                                                                                                                         

      %let i = %eval(&i+1);                                                                                                                                                                                             

                                                                                                                                                                                                                         

  %end;

 

  %let varcnt = %eval(&i-1);                                                                                                                                                                                             

  %put The number of data sets being read are &varcnt ;                                                                                                                                                                  

     

  proc contents data= out.&fn;                                                                                                                                                                                    

    title2 "Contents of &fn" ;                                                                                                                                                                                       

  run;                                                                                                                                                                                                                  

 

  proc print data=out.&fn (obs=20) ;                                                                                                                                                                                

    title2 "20 observations in &fn " ;                                                                                                                                                                            

  run;                                                                                                                                                                                                               

                                                                                                                                                                                                                              

  proc freq data=out.&fn;                                                                                                                                                                                         

    title2 "Counts by soure data in &fn " ;                                                                                                                                                                      

    tables source ;                                                                                                                                                                                                     

  run;                                                                                                                                                                                                                   

                                                                                                                                                                                                                         

%mend loopit;              

 

 

*--- Execute macro;

options obs=max;

%loopit( varlist =&flist, fn=testsds, firstobs=1, lrecl=24 ); 

 

/******************************End of Program************************************************

 

SAS Output

           Reading multiple data files to SAS

              20 observations in testsds

 

      Obs     source     a    b    c    d      e

 

        1    dumfile1    1    1    1    0    163

        2    dumfile1    2    1    1    0     34

        3    dumfile1    1    1    2    0    148

        4    dumfile1    2    1    2    0    957

        5    dumfile1    1    1    3    0    128

        6    dumfile1    2    1    3    0    110

        7    dumfile1    1    1    4    0    221

        8    dumfile1    2    1    4    0    227

        9    dumfile1    1    1    5    0      6

       10    dumfile1    2    1    5    0      9

       11    dumfile1    1    1    6    0    539

       12    dumfile1    2    1    6    0    526

       13    dumfile1    1    2    1    0    866

       14    dumfile2    2    2    1    0    869

       15    dumfile2    1    2    2    0     98

       16    dumfile2    2    2    2    0     88

       17    dumfile2    1    2    3    0     15

       18    dumfile2    2    2    3    0     18

       19    dumfile2    1    2    4    0      9

       20    dumfile2    2    2    4    0      7

 

 

                         Reading multiple data files to SAS                              

                                Contents of testsds

 

                              The CONTENTS Procedure

 

 Data Set Name: OUT.TESTSDS                              Observations:         813

 Member Type:   DATA                                     Variables:            6

 Engine:        V8                                       Indexes:              0

 Created:       14:40 Thursday, October 2, 2003          Observation Length:   56

 Last Modified: 14:40 Thursday, October 2, 2003          Deleted Observations: 0

 Protection:                                             Compressed:           NO

 Data Set Type:                                          Sorted:               NO

 Label:

 

 

                   -----Engine/Host Dependent Information-----

 

               Data Set Page Size:         8192

               Number of Data Set Pages:   6

               First Data Page:            1

               Max Obs per Page:           145

               Obs in First Data Page:     117

               Number of Data Set Repairs: 0

               File Name:                  c:\temp\testsds.sas7bdat

               Release Created:            8.0202M0

               Host Created:               WIN_PRO

 

 

               -----Alphabetic List of Variables and Attributes-----

 

                        #    Variable    Type    Len    Pos

                        ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

                        2    a           Num       8      0

                        3    b           Num       8      8

                        4    c           Num       8     16

                        5    d           Num       8     24

                        6    e           Num       8     32

                        1    source      Char     10     40

 

 

 

                Reading multiple data files to SAS

                Counts by source data in testsds

 

                       The FREQ Procedure

 

                                       Cumulative    Cumulative

  source      Frequency     Percent     Frequency      Percent

  ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

  dumfile1          13        1.60            13         1.60

  dumfile2          86       10.58            99        12.18

  dumfile3         100       12.30           199        24.48

  dumfile4          30        3.69           229        28.17

  dumfile5          44        5.41           273        33.58

  dumfile6          70        8.61           343        42.19

  dumfile7         220       27.06           563        69.25

  dumfile8         150       18.45           713        87.70

  dumfile9         100       12.30           813       100.00