您的当前位置:首页正文

SUGI 24 Advanced Features of PROC TABULATE

2020-06-14 来源:钮旅网
Hands-on WorkshopsPaper 153Advanced Features of PROC TABULATE

Thomas J. Winn Jr., Texas State Comptroller's Office, Austin, Texas

ABSTRACT

The hands-on workshop associated with this paper will provideparticipants with practical information regarding someadvanced features of PROC TABULATE. It will include areview of the basic syntax for PROC TABULATE. It also willcover methods for customizing the appearance of tables

generated by PROC TABULATE (formatting, modifying tableoutlining characters, removing separator lines, changingheadings, and changing the width of cells), for using PROCTABULATE to compute and to display percentages and otherquotients, and for using PROC TABULATE for displayingstatistical results.

DATA USED FOR EXAMPLES

This paper will utilize a certain SAS data file, PRDSALE, fromthe Release 7.00 SASHELP data library, as well as a SAS datafile created by the following SAS DATA step (representing ahypothetical used car dealer’s current inventory of Saturnautomobiles):

DATA SATURN;

LENGTH COLOR $10 TRANSM $9;

INPUT YEAR MODEL $ PRICE MILEAGE COLOR $ TRANSM $ AC; DATALINES;

96 SC2 12475 23705 white automatic 196 SL2 8488 55460 black manual 1

99 SL1 14007 3015 dark_green automatic 096 SL2 11048 44480 purple automatic 096 SC2 11224 14433 red manual 0

94 SC2 7884 85201 med-blue automatic 097 SW2 12668 31912 white automatic 195 SL1 9464 29734 blue_green automatic 196 SL2 11264 24424 copper automatic 193 SL2 8442 49770 blue_green automatic 097 SL2 12996 19830 white automatic 198 SC2 13848 20004 white automatic 098 SL1 12998 13454 silver automatic 196 SL2 10994 67663 light_plum manual 197 SW2 13628 20789 gold automatic 1

96 SL1 9996 44084 dark_green automatic 195 SL1 8588 67716 med_blue manual 195 SC2 8968 80931 red manual 0

93 SL2 7965 77004 med_blue automatic 197 SC2 13998 29778 silver automatic 196 SL2 12248 14543 gold automatic 194 SC1 8996 57688 plum manual 195 SC1 9428 49862 red manual 1

96 SC2 11468 64401 purple automatic 197 SW2 12994 17422 white manual 0;;;;

REVIEW OF ELEMENTARY CONCEPTSREGARDING PROC TABULATE

Introduction to PROC TABULATE

PROC TABULATE is used to build tabular summary reportscontaining descriptive statistical information, including

hierarchical relationships among variables. PROC TABULATE

is the SAS® System’s implementation of TPL (Table

Producing Language), which was developed at the U.S. Bureauof Labor Statistics during the 1970’s, for generating tabularreports of descriptive statistics involving employment data.PROC TABULATE is more powerful for producing tabulationsthan PROC FREQ, and it is a more flexible statistical reportwriter than PROC MEANS.

Basic Syntax for PROC TABULATE

A general form for a PROC TABULATE step is: PROC TABULATE ; CLASS class-variable-list; VAR analysis-variable-list;

TABLE < row-expression,> column-expression ; BY variable-1 < … variable-n>; FORMAT variable-list-1 format-1

< … variable-list-n format-n>; FREQ variable;

KEYLABEL keyword-1=‘description-1’

< … keyword-n=“description-n’>;

LABEL variable-1=‘label-1’ <…variable-n=‘label-n’>; WEIGHT variable; TITLE ‘text’;

The PROC TABULATE statement invokes the procedure andspecifies certain options. The most commonly-used optionsare: DATA=SAS-data-file, which specifies the data set to beused as input by the procedure, and FORMAT=numeric-format, which determines the width of each cell (the

intersection of a row and a column of a PROC TABULATEsummary table) in the table. The MISSING option requeststhat missing values be regarded as valid levels for classificationvariables. Unless the MISSING option is specified,

observations with missing values for class variables will not beincluded in the analysis. The NOSEPS option removes theinterior horizontal lines from the printed report.

It is a requirement that every PROC TABULATE step mustinclude a PROC TABULATE statement, either a CLASSstatement or a VAR statement, or both, and a TABLE

statement. Each variable appearing in a TABLE statementmust be mentioned in either the CLASS statement or the VARstatement. Variables may not be mentioned in both the CLASSand the VAR statements.

Coding PROC TABULATE

Before writing any SAS code for PROC TABULATE, theprogrammer needs to decide what the final report should looklike. It is essential to have the fundamental design of the

pages, row, and columns of the table clearly in mind. This isthe vision which will determine the particular code to be used.The following is an example of a table generated by PROCTABULATE from the SATURN data set:

dcccccccccccccecccccccccccccccccccecccccccccfbb75$160bbbgcccccccccecccccccccibbbDXWRPDWLFbPDQXDOb$OObbgccccccccchccccccccchcccccccccibb1b1b1bgccccccecccccchccccccccchccccccccchcccccccccib<($5b02'(/bbbbgcccccchccccccibbbb󰀜󰀖b6/󰀕b󰀕b󰀓b󰀕bgcccccchcccccchccccccccchccccccccchcccccccccib󰀜󰀗b6&󰀔b󰀓b󰀔b󰀔bbgcccccchccccccccchccccccccchcccccccccibb6&󰀕b󰀔b󰀓b󰀔bgcccccchcccccchccccccccchccccccccchcccccccccib󰀜󰀘b6&󰀔b󰀓b󰀔b󰀔bbgcccccchccccccccchccccccccchcccccccccibb6&󰀕b󰀓b󰀔b󰀔bbgcccccchccccccccchccccccccchcccccccccibb6/󰀔b󰀔b󰀔b󰀕bgcccccchcccccchccccccccchccccccccchcccccccccib󰀜󰀙b6&󰀕b󰀕b󰀔b󰀖bbgcccccchccccccccchccccccccchcccccccccibb6/󰀔b󰀔b󰀓b󰀔bbgcccccchccccccccchccccccccchcccccccccibb6/󰀕b󰀖b󰀕b󰀘bgcccccchcccccchccccccccchccccccccchcccccccccib󰀜󰀚b6&󰀕b󰀔b󰀓b󰀔bbgcccccchccccccccchccccccccchcccccccccibb6/󰀕b󰀔b󰀓b󰀔bbgcccccchccccccccchccccccccchcccccccccibb6:󰀕b󰀕b󰀔b󰀖bgcccccchcccccchccccccccchccccccccchcccccccccib󰀜󰀛b6&󰀕b󰀔b󰀓b󰀔bbgcccccchccccccccchccccccccchcccccccccibb6/󰀔b󰀔b󰀓b󰀔bgcccccchcccccchccccccccchccccccccchcccccccccib󰀜󰀜b6/󰀔b󰀔b󰀓b󰀔bgcccccckcccccchccccccccchccccccccchcccccccccib$OOb󰀔󰀚b󰀛b󰀕󰀘bjccccccccccccckccccccccckccccccccckccccccccclThe simplest way to approach coding for PROC TABULATE isin terms of the following five steps:

• Specification of classification variables,• Specification of analysis variables,• Definition of dimensions of table,

• Identification of desired statistics, and• Labeling and formatting the table.

For a complete explanation of this approach, see the papers byVirgile (1997) or Winn (1998), noted in the References at theend of this paper.

Step 1 -- Specification of Classification VariablesThe CLASS statement is used to specify any categoricalvariables which will be used for grouping purposes in theanalysis. These variable-names will be the page, row, andcolumn headers of the table. In most cases, the CLASSstatement is required.

Hands-on WorkshopsPROC TABULATE DATA=SATURN; CLASS YEAR MODEL TRANSM; . . .

Step 2 -- Specification of Analysis VariablesThe VAR statement is used to list any variables which will beused for computing the statistics which are intended to appearin the body of the table. Frequency counts can be computedwithout an analysis variable. However, under most conditions,the VAR statement is required.

PROC TABULATE DATA=SATURN; CLASS YEAR MODEL TRANSM; VAR PRICE MILEAGE; . . .

Step 3 -- Definition of the Table’s DimensionsThe TABLE statement is used to define both the arrangementof the rows and columns of the table, as well as the requestsfor any summary statistics. It is the tricky aspect of using

PROC TABULATE. So let’s consider this complex statementone piece at a time.

In a TABLE statement, the comma is a very important symbol,because it separates the dimensions of the table. If twocommas were specified, then the table would have threedimensions, and the order would be pages, rows, and columns.If only one comma was specified, then the table would have twodimensions, and the order would be rows, columns. Nocomma would be interpreted to mean that the table’s onlydimension would be the column dimension. The table wouldonly have one row.

So, TABLE statements look like the following:

TABLE page-expression, row-expression, column-expression . . .;

TABLE row-expression, column-expression . . .; orTABLE column-expression . . .;In this context, an expression can consist of variables,statistics, operators, format specifications, and label

assignments. Only three operators are needed to specify thepage, row and column headings which identify the structure ofa table.

An asterisk (*) can be used to cross the classificationvariables; that is, to arrange them in a nested manner,

according to the order listed (top, middle, and lower). A blankspace is used to concatenate two classification variables(which will appear in the table: top-to-bottom for row headings,left-to-right for column headings). Parentheses ( ) are used togroup the elements of an expression, and to associate anadjacent operator with each concatenated element inside theparentheses.

Step 4 -- Identification of Desired StatisticsUp to this point, we have only focused on the dimensions of thetables, and our TABLE statements have been incomplete. TheTABLE statement also specifies which summary statistics

should be produced, and pertaining to which analysis variables.Each statistic is identified by a keyword.

N = the number of observations, the frequency count,MIN = the smallest value,MAX = the largest value,

MEAN = the arithmetic mean, or the average,STD = the standard deviation,VAR = the variance,

SUM = the sum of the values,

PCTN = the percentage that one frequency is of anotherfrequency,

PCTSUM = the percentage that one sum is of anothersum,

. . . and other descriptive statistics.

(Note – PROC TABULATE in Version 7 of the SASSystem has numerous statistics not available in previousreleases.)Whenever you cross a variable with a keyword for a statistic,you are identifying the statistic to be applied to that variable(which tells PROC TABULATE what type of calculation toperform). You can cross class variables only with the N orPCTN statistics. By default, if the TABLE statement does notinclude an analysis variable or a statistic, then PROCTABULATE automatically crosses the N statistic with theindicated class variables. Analysis variables can be crossedwith any statistic. By default, if the TABLE statement includesan analysis variable but without crossing it with any statistic,PROC TABULATE automatically crosses it with SUM.Here are some examples:

PROC TABULATE DATA=SATURN; CLASS YEAR MODEL TRANSM; VAR PRICE;

TABLE YEAR*MODEL, TRANSM;

TABLE YEAR*MODEL, TRANSM*PCTN;

TABLE YEAR*MODEL, TRANSM*PRICE*MEAN; TABLE YEAR*MODEL, TRANSM*PRICE*PCTSUM;PROC TABULATE has a universal class variable, ALL, whichcan be used to generate totals for any specified class variable.Just concatenate the keyword ALL into the row or columnexpression of a TABLE statement. Whenever the ALL is

concatenated with other elements in a dimension (page, row, orcolumn), PROC TABULATE summarizes all of the categorieswithin that dimension. Moreover, whenever the ALL is usedwithin a grouping defined by parentheses, PROC TABULATEwill summarize only the categories within the grouping.

PROC TABULATE DATA=SATURN; CLASS YEAR MODEL TRANSM; VAR PRICE;

TABLE YEAR*MODEL ALL, N PRICE*MEAN; TABLE YEAR*MODEL ALL, PRICE*PCTSUM; TABLE (YEAR ALL)*MODEL,

(TRANSM ALL)*(N PRICE*MEAN);

Step 5 -- Labeling and Formatting the TableNow that we know how to define the basic structure of thetables we will generate, we would like to be able to make thetables more self-explanatory; that is, easier to read andinterpret.

As in many other SAS procedures, you can use a LABELstatement to replace variable names with more descriptiveheadings for your class variables.

PROC TABULATE DATA=SATURN; CLASS YEAR MODEL TRANSM; VAR MILEAGE;

LABEL MILEAGE=’Odometer Reading’;

TABLE YEAR*MODEL ALL, N MILEAGE*MEAN;There also is a way to specify temporary labels in a TABLEstatement.

Hands-on WorkshopsPROC TABULATE DATA=SATURN; CLASS YEAR MODEL TRANSM; VAR MILEAGE;

TABLE YEAR*MODEL ALL=’All Autos Combined’, N MILEAGE=’Odometer Reading’*MEAN;To assign labels to procedure-generated statistics and theuniversal class variable, we use the KEYLABEL statement. KEYLABEL N = ‘Count‘ ALL = ‘Total’

PCTN = ‘Percent’;

To suppress the label for a procedure-generated statistic, usea statement like the following:

KEYLABEL N = ’ ‘;You can use the FORMAT= option of the PROC TABULATEstatement and the F= format modifiers in the TABLE statementto format the contents of table cells, and to set the width of acolumn.

TITLE and FOOTNOTE statements also can be used to

enhance the tabular reports generated by PROC TABULATE,in the same manner as these statements are used in otherSAS report-writing procedures.

COMPUTING AND DISPLAYINGPERCENTAGES

Using PROC TABULATE to compute percentages is a bittricky. PCTN and PCTSUM are the keywords which specify,respectively, the percentage of the value in a particular cell tothe value in another cell, or to the total of a group of cells.Whenever someone performs percentage calculations, it isessential to be very clear concerning which number is to beused as the denominator in the computation. PROCTABULATE allows a programmer to identify the specificdenominator to be used in computing percentages.

By default, if a denominator is not explicitly defined in a TABLEstatement which includes percentages, then PROC

TABULATE uses the total of the N-frequency-counts for theentire data set as the denominator in a PCTN calculation, andthe total of all of the SUM cells for the denominator in aPCTSUM calculation – in each case, the percentage is

computed based upon all of the observations in the data set. Ifthat is what a programmer intends, then specifying PCTN orPCTSUM in a TABLE statement isn’t much different thanrequesting any other summary statistic. In that case, youwouldn’t need to define the denominator, because the defaultdenominator is implicitly assumed (see the examples, above, orthe first two of the examples which follow).

However, whenever it is necessary to use a particular total asthe denominator in a PROC TABULATE percentage

calculation, then a denominator definition must be specified inthe TABLE statement. To specify how the percentage is to becalculated, the “less-than” (<) and “greater-than” (>) inequalitysigns are used as brackets to enclose an expression whichtells PROC TABULATE how to identify the appropriatedenominator. Now, this probably doesn’t work the way youmight think that it should, so pay close attention to theexamples, below.

Here are some examples involving percentage calculations:

PROC TABULATE DATA=SATURN; CLASS YEAR MODEL TRANSM; VAR PRICE;

LABEL TRANSM = ’Transmission’; KEYLABEL N=’Count’ ALL=’Total’ SUM=’$’

PCTN=’% of Autos’ PCTSUM=’% of $’; TABLE YEAR*MODEL ALL,

(TRANSM ALL)*(N*F=5.0 PCTN*F=6.2); TABLE MODEL ALL,

(TRANSM ALL)*PRICE*(SUM PCTSUM); TABLE YEAR*MODEL, TRANSM

*(N*F=5.0 PCTN*F=6.2); TABLE YEAR*MODEL, TRANSM

*(N*F=5.0 PCTN*F=6.2); TABLE YEAR*MODEL, TRANSM*PRICE

*(SUM PCTSUM*F=6.2); TABLE YEAR*MODEL, TRANSM*PRICE

*(SUM PCTSUM*F=6.2);The first two TABLE statements do not contain an explicitdenominator definition, therefore the percentage calculationsare carried out with respect to the entire data set. The firstTABLE statement would generate a breakdown of the countsand frequency percentages for each combination of values ofYEAR, MODEL, and TRANSM, where the percentages arecalculated with respect to the entire data set. The secondTABLE statement produces a breakdown of the sum of thevalues of PRICE and percentage of the sum of PRICE, foreach combination of values of YEAR, MODEL, and TRANSM,and where the percentages are calculated with respect to theentire data set.

The third TABLE statement would generate breakdowns of thecounts and the associated frequency percentages, for eachcombination of values of YEAR, MODEL, and TRANSM,

where the percentages are calculated in a column-wise manner(using the combined counts for each value of TRANS, acrossall values of combinations of YEAR and MODEL, as thedenominator for all cells in that column). The fourth TABLEstatement would generate breakdowns of the counts and theassociated frequency percentages, for each combination ofvalues of YEAR, MODEL, and TRANSM, where the

percentages are calculated in a row-wise manner (using thetotal counts for each particular combination of YEAR and

MODEL, across all values of TRANSM, as the denominator forall cells in that row).

The fifth TABLE statement would generate breakdowns of thesum of the values of PRICE and percentage of the sum of

PRICE, for each combination of values of YEAR, MODEL, andTRANSM, where the percentages are calculated in a column-wise manner (using the total of PRICE for each value ofTRANS, across all values of combinations of YEAR andMODEL, as the denominator for all cells in that column).

Finally, the sixth TABLE statement would generate breakdownsof the sum of the values of PRICE and percentage of the sumof PRICE, for each combination of values of YEAR, MODEL,and TRANSM, where the percentages are calculated in a row-wise manner (using the total of PRICE for each particularcombination of YEAR and MODEL, across all values ofTRANSM, as the denominator for all cells in that row).

Hands-on WorkshopsObserve that, to obtain percentages by row, we use thecolumn-expression in the “denominator definition”; and toobtain percentages by column, we use the row-expression inthe “denominator definition”.Notice that PROC TABULATE allows the use of denominatordefinitions which contain crossings. However, PROC

TABULATE does not permit denominator definitions which

contain groupings which are defined by the use of parentheses.Following are a couple of examples which illustrate the use ofthe universal class variable, ALL, in a denominator definition forpercentage calculations. Notice that whenever row- or column-percentages are to be produced for a column- or row-expression which include ALL, then ALL also must be includedin the denominator definition.

PROC TABULATE DATA=SATURN; CLASS YEAR MODEL TRANSM; VAR PRICE;

LABEL TRANSM = ’Transmission’; KEYLABEL N=’Count’ ALL=’Total’ SUM=’$’

PCTN=’% of Autos’ PCTSUM=’% of $’;

TABLE YEAR*MODEL ALL, (TRANSM ALL)

*(N*F=5.0 PCTN*F=6.2); TABLE YEAR*MODEL ALL, (TRANSM ALL)*PRICE

*(SUM PCTSUM*F=6.2);

CUSTOMIZING THE APPEARANCE OFTABLES

PROC TABULATE provides several ways for customizing theappearance of tables. We have already discussed some of theways of changing the default headings which are used in thetables. You also may want to use formats to change theheadings which correspond to values of a class variable.The default for displaying cells with missing numeric values isa period. You can change the way missing values aredisplayed by using the MISSTEXT= option to define up totwenty characters of text which will print in the table cellswhenever a particular combination of class variable values isnot found in the input data set.

PROC TABULATE DATA=SATURN; CLASS YEAR MODEL TRANSM; TABLE YEAR*MODEL ALL,

TRANSM=’Transmission’ ALL / MISSTEXT=’0’;

TITLE ‘Saturn Automobiles in Used Car Inventory’;Formats can be used to substitute labels for values of theclassification variables. Formats also can be used to combinemany values of the classification variables into a much smallernumber of values to be printed in the report. We create customformats by using PROC FORMAT, and we invoke thoseformats in PROC TABULATE either through a FORMATstatement, or by crossing F=format-name. in the TABLEstatement with the particular variable.

PROC FORMAT;

VALUE ODOMFMT low -9999 = ‘less than 10,000’ 10000-34999 = ’10,000 to 34,999’ 35000-59999 = ’35,000 to 59,999’ 60000-high = ’at least 60,000’;PROC TABULATE DATA=SATURN FORMAT=8.0; CLASS YEAR MODEL MILEAGE; KEYLABEL N=’Number of Autos’

ALL=’All Autos Combined’; FORMAT MILEAGE ODOMFMT.; TABLE YEAR*MODEL ALL,

MILEAGE=’Odometer Reading’ ALL;

TITLE ‘Saturn Automobiles in Used Car Inventory’;The RTSPACE= (or RTS=) option defines the total amount ofspace for the row headings. If there are several levels of

headings for row, then the space is divided equally among thelevels, after subtracting the spaces which are needed for thevertical lines.

Whenever a table produced by PROC TABULATE is too wideto fit on a single page, the procedure automatically splits thetable, to span as many separate pages as are necessary forprinting. For short, wide tables, the CONDENSE option couldbe specified on the TABLE statement, in order to print as manylogical pages as possible on a single page, one below theother.

The FORMCHAR= option in the PROC TABULATE statementmay be used to change the characters which are used foroutlining the table. To replace all of the outlining characterswith blanks, use the option

FORMCHAR(1,2,3,4,5,6,7,8,9,10,11)=’ ‘(that is, 11 blanks, enclosed in quotes).USING PROC TABULATE FOR DISPLAYINGSTATISTICAL RESULTS

Sometimes we may desire to produce tables which includestatistics which are not calculated by PROC TABULATE, orwe may want to combine several statistics in a single table. Ingeneral, either of these situations will require the use of astatistical PROC which is capable of writing its results to anoutput DATA set (such as PROC MEANS, PROC

SUMMARY, PROC UNIVARIATE, or PROC CORR), and/oran intermediate DATA step which restructures the data into aform which PROC TABULATE is able to utilize.

Consider the SAS data file, PRDSALE, from the SASHELPdata library. PRDSALE contains 1,440 observations, includingpredicted and actual sales data for various products, divisions,and locations, covering the period of January 1993 throughDecember 1994. Here is some information from the PROCCONTENTS output for this data set:

# Variable Type Len Pos Format Label

---------------------------------------------------------------------------------1 ACTUAL Num 8 0 DOLLAR12.2 Actual Sales3 COUNTRY Char 10 40 $CHAR10. Country5 DIVISION Char 10 60 $CHAR10. Division10 MONTH Num 8 32 MONNAME3. Month

2 PREDICT Num 8 8 DOLLAR12.2 Predicted Sales6 PRODTYPE Char 10 70 $CHAR10. Product type7 PRODUCT Char 10 80 $CHAR10. Product8 QUARTER Num 8 16 8. Quarter4 REGION Char 10 50 $CHAR10. Region9 YEAR Num 8 24 4. Year

Hands-on WorkshopsLet us suppose that we are interested in examining the

accuracy of the predictions, by computing the percentage error[PCTERR = 100*(PREDICT-ACTUAL)/ACTUAL] for eachvalue of PRODUCT, and in each month. This statistic is notcalculated by PROC TABULATE. We could calculate it in aDATA step, or with some combination of a PROC step and aDATA step, and then use TABULATE to write the report; or wecould “trick” PROC TABULATE into doing most of the workanyway, by using an intermediate DATA step just before aPROC TABULATE step with a carefully coded TABLEstatement. Let us explore both of these approaches.

PROC SUMMARY DATA=SASHELP.PRDSALE NOPRINT NWAY;

CLASS PRODUCT YEAR MONTH; VAR ACTUAL PREDICT;

OUTPUT OUT=SUMRY SUM=;

DATA SUMRY; SET SUMRY;

PCTERR = 100*(PREDICT-ACTUAL)/ACTUAL;PROC TABULATE DATA=SUMRY; CLASS PRODUCT YEAR MONTH; VAR PREDICT ACTUAL PCTERR; KEYLABEL SUM=’ ‘;

TABLE PRODUCT, YEAR*MONTH, PREDICT ACTUAL PCTERR; TITLE ‘Accuracy of Sales Predictions in’ ‘ SASHELP.PRDSALE Data Set’;Here is another approach to the same problem:

DATA SALES;

SET SASHELP.PRDSALE;

ERR100 = (PREDICT-ACTUAL)/100;PROC TABULATE DATA=SALES FORMAT=9.0; CLASS PRODUCT YEAR MONTH; VAR PREDICT ACTUAL ERR100; KEYLABEL SUM=’ ‘

PCTSUM=’ ‘;

LABEL ERR100 = ‘Percent Error’; TABLE PRODUCT, YEAR*MONTH, PREDICT ACTUAL

ERR100*(PCTSUM) *F=PERCENT9.3;

TITLE ‘Accuracy of Sales Predictions in’ ‘ SASHELP.PRDSALE Data Set’;

The preceding example also illustrates something else aboutPROC TABULATE. Strictly speaking, TABULATE does notcompute “pure” quotients of sums; however, it does calculatethe percentage that one sum represents of another sum

(PCTSUM). Therefore, if we want to use PROC TABULATEto compute quotients of sums, then pre-divide the summandsof the numerator sums by 100, and then use PROC

TABULATE to cross the numerator sums with PCTSUM,

where the denominator definition is the analysis variable for theintended denominator sums. The reason for pre-dividing by100 is to ensure that the decimal point is correctly positioned inthe quotient.

We have already noted that, in a TABLE statement, the

dimensions of the table are separated by commas. If all of thevariables to be used in the table occur in the same dimension,and if there is no comma, then PROC TABULATE interpretsthe TABLE statement to mean that the table’s only dimension

would be the column dimension. So, the table would only haveone row, and the table would be a horizontal display, with eachvariable occupying its own column. Well, suppose that avertical display is desired. Maybe you will be directed toprepare a tabular report in which various statistics are to bearranged in a single column. What then? A simple illustrativeexample follows.

Let us aggregate the PRDSALE data according to product,using a separate variable for the sales corresponding to eachvalue of PRODUCT.

DATA PRODS;

SET SASHELP.PRDSALE;

KEEP YEAR MONTH BEDS CHAIRS DESKS SOFAS TABLES;

IF PRODUCT = ‘BED’ THEN BEDS = ACTUAL; ELSE IF PRODUCT = ‘CHAIR’ THEN CHAIRS = ACTUAL; ELSE IF PRODUCT = ‘DESK’ THEN DESKS = ACTUAL; ELSE IF PRODUCT = ‘SOFA’ THEN SOFAS = ACTUAL; ELSE IF PRODUCT = ‘TABLE’ THEN TABLES = ACTUAL;

PROC TABULATE DATA=PRODS;

VAR BEDS CHAIRS DESKS SOFAS TABLES; KEYLABEL SUM=’$’;

TABLE (BEDS CHAIRS DESKS

SOFAS TABLES)*SUM, ALL = ‘ ‘; TITLE ‘Total Sales for Each Type of Product’ ‘ in SASHELP.PRDSALE Data Set’;

In the example, the combined (over all time periods in the dataset) sales for each of the product types are crossed with ALL= ‘‘. This produces the desired vertical display, instead of thehorizontal arrangement which one might expect.

VERSION 7 ENHANCEMENTS

Release 7.00 of the SAS System includes several

enhancements concerning PROC TABULATE. Using theCLASSDATA= option, you can specify which combinations ofclass variables are to be included in the analysis. Now, youalso can name an output data set from PROC TABULATE.PROC TABULATE has an expanded set of available statistics,including quartiles, and the most commonly-requestedpercentiles. Version 7 includes the new Output DeliverySystem, which allows you to create customized HTML filesfrom PROC TABULATE.

CONCLUSION

PROC TABULATE is a very useful and very powerfulprocedure for constructing tabular reports containingdescriptive statistical information, including hierarchical

relationships among variables. It is well-worth the necessaryinvestment of time and effort for learning the intricacies andsubtleties of its syntax.

Hands-on WorkshopsREFERENCES:

• Jeffery M. Abolafia & Stephen M. Noga, “The TABULATEProcedure: One Step Beyond the Final Chapter”,

• SESUG ‘97 Proceedings (1997), pp. 293-300; and• Proceedings of the Twenty-Third Annual SAS UsersGroup International Conference (1998), pp. 839-844.• Dan Bruns, “The Utter Simplicity of the TABULATEProcedure”,

• Proceedings of the Sixteenth Annual SAS Users GroupInternational Conference (1991), pp. 365-371; and• Proceedings of the Twentieth Annual SAS Users GroupInternational Conference (1995), pp. 363-368; and• Proceedings of the Seventh Annual South-Central SASUsers’ Conference (1997), pp. 54-60.• Dan Bruns, “The Utter ‘Simplicity?’ of the TABULATEProcedure”, Proceedings of the Seventeenth Annual SASUsers Group International Conference (1992), pp. 216-220.• Dan Bruns, “Advanced Features of PROC TABULATE -- or -- The Utter Simplicity of the TABULATE Procedure - TheSequel”, Proceedings of the Twenty-First Annual SASUsers Group International Conference (1996), pp. 242-247.• Dan Bruns, “The Utter ‘Simplicity?’ of the TABULATEProcedure -- The Final Chapter”,

• Proceedings of the Twenty-Second Annual SAS UsersGroup International Conference (1997), pp. 251-256 ;and

• Proceedings of the Seventh Annual South-Central SASUsers’ Conference (1997), pp. 61-66.• Michele M. Burlew, “A Stepwise Approach to Using PROCTABULATE”, Chapter 5 of Reporting From the Field (SASInstitute Inc., 1994), pp. 67-88.

• SAS Institute Inc., “The TABULATE Procedure”, Chapter 37of SAS Procedures Guide, Version 6, Third Edition (1990).• SAS Institute Inc., SAS Guide to TABULATE Processing,Second Edition (1990).• SAS Institute Inc., “Summary Reports -- PROC

TABULATE”, Chapter 5 of SAS Views: SAS Report Writing(1987).

• Bob Virgile, “The Right Approach to Learning PROC

TABULATE”, SESUG ‘97 Proceedings (1997), pp. 189-195.• Tom Winn, “Introduction to Using PROC TABULATE”,

Proceedings of the Eighth Annual South-Central SAS Users’Conference (1998), pp.316-326.SAS is a registered trademark of SAS Institute Inc. in the USA andother countries. ® indicates USA registration.

AUTHOR INFORMATION:

Thomas J. Winn Jr.

Fiscal Management Support,Comptroller of Public AccountsL.B.J. State Office Building111 E 17th StreetAustin, TX 78774

Telephone: (512) 463-4907

E-Mail: tom.winn@cpa.state.tx.us

因篇幅问题不能全部显示,请点此查看更多更全内容