Topics Covered in this Session
- Statistical Software Packages
- SPSS - Statistical Package for the Social Sciences
Statistical Software Packages
It has been estimated that 90% of all statistical analysis is performed by computers. A computer can perform statistical calculations more easily, more quickly and more accurately than a person using a calculator or other device, and the time saved can be devoted to the more important task of data analysis. Computers rarely make mistakes, but the people who write programs, collect data and enter data can and do make them. If poor data or faulty programs are introduced into the computer, the results are likely to contain errors and may be meaningless.
Serious researchers have generally come to see the computer as an invaluable tool of their trade, and anybody attempting quantitative studies would be wise to take the time to become proficient with a statistical software package. Given the proliferation of microcomputers in schools and colleges, equipment with the appropriate statistical software is generally available for prospective researchers to practice on and learn from.
It is not the intent here to provide a training program in the use of statistical software. Most of the popular packages come with their own tutorials and help facilities, which do a much better job than can be accomplished here. Instead, some of the most important features of the very popular SPSS package will be presented.
Statistical Package for the Social Sciences (SPSS)
SPSS is probably the most popular statistical software package ever developed. It is available in practically every college computer center and is increasingly being used in private businesses even though it was originally designed for use by social scientists. The Statistical Analysis System (SAS) is another very popular statistical software package used by researchers. Either one is appropriate for the vast majority of statistical applications used in educational research studies.
In this primer, SPSS will be used to demonstrate the basic components of most statistical software packages.
The first step in preparing and organizing data for computer processing is determining what data is needed, from what source(s) it will come and how it will be collected. These questions are best answered by the researcher and depend upon the nature of the project. The sample size (e.g. 2000, 200, 20), the number of data elements or variables (e.g. 100, 50, 10), and the distribution of the study population (e.g. national population, one school, one class) will all have a significant impact on data collection activities. Generally, the larger the sample, the more data elements needed or the wider the distribution of the population, the more difficult it is to collect the data and to control its accuracy. Conversely, the smaller the sample, the fewer the data elements or the narrower the distribution, the easier both tasks become.
In preparing data for computer processing, it is necessary to identify a data format or layout for the cases to be used in the study. The data format should be planned carefully because, once it has been established and data has been collected, it is rarely changed. The major purposes of a data format are to define:
1. what type of data (e.g. numeric, alphabetic) is being used for each data element;
2. how many characters are being used for each data element;
3. where each data element is located in the data record.
Below is a partial sample of the data format statements that were used with SPSS to analyze data for a study on the use of microcomputers in New York public schools. The study was based on a survey mailed to building principals.
DATA LIST FILE = 'SURVEY01.DAT'/
ID 1-3 SCHOOL 4-5(A) REGION 6-11(A) LEVEL 12 ENROLL 13-16 PERSON 17 MICROS 18-20 LOCATION 21.
VALUE LABELS LEVEL 1 'ES' 2 'MS' 3 'HS' 4 'OTHER'.
VALUE LABELS PERSON 1 'FT ADMIN' 2 'FT TCHR' 3 'FT COORD' 4 'PT COORD'.
VALUE LABELS LOCATION 1 'CENTRAL' 2 'CLASS' 3 'BOTH'.
COMPUTE STUDMICR = ENROLL/MICROS.
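For students who have some familiarity with computers, the DATA LIST statement should appear straightforward and very similar to data formats they have used or seen with other software packages. Each variable name is followed by the column or columns it occupies in the data record, and (A) marks an alphabetic variable. The VALUE LABELS statements attach descriptive labels to the numeric codes, and the COMPUTE statement creates a new variable, the number of students per microcomputer.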
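For readers more used to a general-purpose language, the same layout can be written down as a small table of fields. The Python sketch below is not SPSS and was not part of the study; the `Field` type and the `check_layout` helper are hypothetical names chosen for illustration. It records the type, width and location of each data element, using the column positions from the DATA LIST statement above, and checks that no two fields overlap.

```python
from typing import NamedTuple


class Field(NamedTuple):
    """One data element: name, first and last column (1-based, inclusive), type."""
    name: str
    start: int
    end: int
    kind: str  # "numeric" or "alpha"; (A) in DATA LIST marks an alphabetic field

    @property
    def width(self) -> int:
        # Number of characters the field occupies in the record.
        return self.end - self.start + 1


# Column positions copied from the DATA LIST statement above.
LAYOUT = [
    Field("ID", 1, 3, "numeric"),
    Field("SCHOOL", 4, 5, "alpha"),
    Field("REGION", 6, 11, "alpha"),
    Field("LEVEL", 12, 12, "numeric"),
    Field("ENROLL", 13, 16, "numeric"),
    Field("PERSON", 17, 17, "numeric"),
    Field("MICROS", 18, 20, "numeric"),
    Field("LOCATION", 21, 21, "numeric"),
]


def check_layout(layout):
    """Return the last column used; raise ValueError on reversed or overlapping fields."""
    last_end = 0
    for field in sorted(layout, key=lambda f: f.start):
        if field.start > field.end:
            raise ValueError(f"{field.name}: start column is after end column")
        if field.start <= last_end:
            raise ValueError(f"{field.name}: overlaps the previous field")
        last_end = field.end
    return last_end


if __name__ == "__main__":
    for field in LAYOUT:
        print(f"{field.name:<8} columns {field.start:>2}-{field.end:<2} "
              f"width {field.width} {field.kind}")
    print("record length:", check_layout(LAYOUT))
```

Keeping the layout as data rather than scattering column numbers through a program makes it easy to check once, which matters because the format is rarely changed after data collection begins.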
Once a DATA LIST statement has been established, the researcher needs to collect the data and enter it into the computer. There are many ways to accomplish this; most researchers will use some type of database or word processing software. A sample of the actual data used with the above DATA LIST statement appears below. There were 136 cases in this study; the data for cases 1, 2, 3 and 136 are shown.
-----Character Position------------
123456789012345678901234567890
109TEURBAN 2065010453 - (Case 1)
234JJNONURB3120020903 - (Case 2)
112BSURBAN 1044040251 - (Case 3)
211WIURBAN 3202011003 - (Case 136)
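To see how the column positions map onto these records, the following Python sketch slices each line according to the DATA LIST layout, decodes the coded values with the VALUE LABELS text, and mirrors the COMPUTE statement. It is an illustration only, not how SPSS reads the file; `parse_record`, `LAYOUT` and `VALUE_LABELS` are hypothetical names, and the use of `None` when MICROS is zero is an assumption made for this sketch.

```python
# Field name -> (first column, last column, is_alphabetic); columns are 1-based,
# copied from the DATA LIST statement.
LAYOUT = {
    "ID": (1, 3, False),
    "SCHOOL": (4, 5, True),
    "REGION": (6, 11, True),
    "LEVEL": (12, 12, False),
    "ENROLL": (13, 16, False),
    "PERSON": (17, 17, False),
    "MICROS": (18, 20, False),
    "LOCATION": (21, 21, False),
}

# Labels copied from the VALUE LABELS statements.
VALUE_LABELS = {
    "LEVEL": {1: "ES", 2: "MS", 3: "HS", 4: "OTHER"},
    "PERSON": {1: "FT ADMIN", 2: "FT TCHR", 3: "FT COORD", 4: "PT COORD"},
    "LOCATION": {1: "CENTRAL", 2: "CLASS", 3: "BOTH"},
}

# The four records shown above, without the "(Case n)" annotations.
RECORDS = [
    "109TEURBAN 2065010453",
    "234JJNONURB3120020903",
    "112BSURBAN 1044040251",
    "211WIURBAN 3202011003",
]


def parse_record(line):
    """Slice one fixed-column record into a dict of variables."""
    case = {}
    for name, (start, end, is_alpha) in LAYOUT.items():
        text = line[start - 1:end]  # convert 1-based inclusive columns to a slice
        case[name] = text.strip() if is_alpha else int(text)
    # Mirrors COMPUTE STUDMICR = ENROLL/MICROS; None stands in for a
    # missing value when a school reports no microcomputers (assumption).
    case["STUDMICR"] = case["ENROLL"] / case["MICROS"] if case["MICROS"] else None
    return case


if __name__ == "__main__":
    for record in RECORDS:
        case = parse_record(record)
        level = VALUE_LABELS["LEVEL"][case["LEVEL"]]
        print(case["ID"], case["REGION"], level, case["ENROLL"],
              case["MICROS"], round(case["STUDMICR"], 1))
```

Reading the first record this way gives ID 109, school TE, region URBAN, level 2 (MS), an enrollment of 650, respondent code 1 (FT ADMIN), 45 microcomputers and location 3 (BOTH).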
Data Analysis
SPSS provides a number of statistical procedures for a wide variety of analyses. A partial list of the most commonly used procedures is as follows:
- ANOVA - Analysis of Variance
- CORRELATION - Correlational Analysis (i.e. Pearson's Product Moment Coefficient)
- CROSSTABS - Crosstabulations, Chi-Square
- FREQUENCIES - Frequency Distributions, Graphs, Charts
- MEANS - Measures of Central Tendency (i.e. Means)
- ONEWAY - Oneway Analysis of Variance
- PLOT - Plot Regression Lines
- REGRESSION - Regression Analysis
- T-TEST - t-test
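As a rough picture of what a few of these procedures compute, the Python sketch below reproduces a frequency count, a crosstabulation, group means and a Pearson correlation. The function names are hypothetical and only loosely modeled on the SPSS procedure names; they are not SPSS commands. The input is limited to the four sample records shown earlier, so the output illustrates the mechanics and is not a result of the 136-case study.

```python
from collections import Counter, defaultdict
from math import sqrt
from statistics import mean

# The four sample records shown earlier, already decoded by column position.
CASES = [
    {"ID": 109, "LEVEL": 2, "ENROLL": 650, "MICROS": 45, "LOCATION": 3},
    {"ID": 234, "LEVEL": 3, "ENROLL": 1200, "MICROS": 90, "LOCATION": 3},
    {"ID": 112, "LEVEL": 1, "ENROLL": 440, "MICROS": 25, "LOCATION": 1},
    {"ID": 211, "LEVEL": 3, "ENROLL": 2020, "MICROS": 100, "LOCATION": 3},
]


def frequencies(cases, var):
    """Count how often each value of `var` occurs (compare FREQUENCIES)."""
    return Counter(case[var] for case in cases)


def crosstabs(cases, row_var, col_var):
    """Count cases in each (row, column) cell (compare CROSSTABS, no chi-square)."""
    return Counter((case[row_var], case[col_var]) for case in cases)


def means_by_group(cases, var, group_var):
    """Mean of `var` within each value of `group_var` (compare MEANS)."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[group_var]].append(case[var])
    return {group: mean(values) for group, values in groups.items()}


def pearson_r(cases, x_var, y_var):
    """Pearson product-moment correlation between two numeric variables."""
    xs = [case[x_var] for case in cases]
    ys = [case[y_var] for case in cases]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    print("LEVEL frequencies:", dict(frequencies(CASES, "LEVEL")))
    print("LEVEL by LOCATION:", dict(crosstabs(CASES, "LEVEL", "LOCATION")))
    print("Mean ENROLL by LEVEL:", means_by_group(CASES, "ENROLL", "LEVEL"))
    print("r(ENROLL, MICROS):", round(pearson_r(CASES, "ENROLL", "MICROS"), 3))
```

With only four cases none of these figures carries statistical weight; the point is simply that each procedure name stands for a well-defined calculation over the variables named in the data format.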
FOR MORE INFORMATION ON THE TOPICS COVERED IN THIS SESSION, PLEASE REFER TO THE APPENDIX IN A.G. PICCIANO "EDUCATIONAL RESEARCH PRIMER" AS WELL AS THE MANUALS AND DOCUMENTATION PROVIDED BY SPSS, INC.