IBM® SPSS® Statistics Base is statistical analysis software that delivers the core capabilities you need to take the analytical process from start to finish. It is easy to use and includes a broad range of procedures and techniques to help you increase revenue, outperform competitors, conduct research and make better decisions.
SPSS Statistics Base provides essential statistical analysis tools for every step of the analytical process.
● A comprehensive range of statistical procedures for conducting accurate analysis.
● Built-in techniques to prepare data for analysis quickly and easily.
● Sophisticated reporting functionality for highly effective chart creation.
● Powerful visualization capabilities that clearly show the significance of your findings.
● Support for all types of data including very large data sets.
A comprehensive range of statistical procedures
● Carry out a wide range of descriptive procedures including crosstabulations, frequencies, compare means and
correlation.
● Predict numerical outcomes and identify groups using factor analysis, cluster analysis, linear regression,
ordinal regression, discriminant analysis and Nearest Neighbor analysis.
● Apply Monte Carlo simulation techniques to build better models and assess risk when inputs are uncertain.
● Use SPSS Statistics Base with other modules, such as IBM SPSS Regression and IBM SPSS Advanced Statistics,
to more accurately identify and analyze complex relationships.
Built-in techniques
● Identify and eliminate duplicate cases and restructure your data files prior to analysis.
● Set up data dictionary information (for example, value labels and variable types) and use it as a template to
prepare all of your data for analysis faster.
● Open multiple data sets within a single session to save time and condense steps.
Sophisticated reporting functionality
● Create commonly used charts, such as scatterplot matrices, histograms and population pyramids, more easily.
● Drag and drop variables and elements onto a chart creation canvas and preview the chart as it is being built.
● Build a chart once, and then use those specifications to create hundreds more just like it.
Powerful visualization capabilities
● Distribute and manipulate information for ad hoc decision-making using report online analytical processing
(OLAP) technology.
● Create high-end charts and graphs to aid analysis and reporting and identify new insights in your data.
● Use pre-built map templates to generate a geographic or demographic analysis that can provide critical
information for decision making.
● Quickly change information and statistics in graphs for new levels of understanding, and convert a table
to a graph with just a few mouse clicks.
Support for all types of data
● Access, manage and analyze any kind of data set including survey data, corporate databases,
data downloaded from the web and IBM Cognos Business Intelligence data.
● Eliminate variability in data due to language-specific encoding and view,
analyze and share data written in multiple languages using built-in Unicode support.
IBM® SPSS® Conjoint gives you a realistic way to measure how individual product attributes affect people's preferences.
When you use both conjoint analysis and competitive product market research for your new products, you are less likely to overlook product dimensions that
are important to your customers or constituents, and more likely to successfully meet their needs.
With IBM SPSS Conjoint, you can easily measure the tradeoff effect of each product attribute in the context of a set of product attributes - as consumers do when making
purchasing decisions.
For example, you can answer critical product market research questions:
● What product attributes do my customers care about?
● What are the most preferred attribute levels?
● How can I most effectively perform pricing and brand equity studies?
You can answer all of your questions before you spend valuable resources trying to bring products or services to market. Use IBM SPSS Conjoint to focus your efforts
on the service or product development that has the best chance of succeeding.
IBM SPSS Conjoint gives you all the tools you need for developing product and service attribute ratings. You can use its three procedures to:
● Generate designs easily - use Orthoplan, the design generator, to produce an orthogonal array of alternative potential
products or services that combine different product/service features at specified levels
● Print "cards" to elicit respondents' preferences - use Plancards to quickly generate cards that respondents can sort to
rank alternative products
● Get informative results - analyze your data using Conjoint, a procedure that's a specially tailored version of regression. Find out which product/service attributes are important and at which levels they are most preferred. You can also perform simulations that tell you the market share of preference for alternative products.
Conduct intelligent planning
Expand the capabilities of IBM SPSS Statistics Base with IBM SPSS Conjoint. Make better decisions about your data and gain knowledge in the planning stage that you
can carry throughout the analytical process.
Save time and money by generating a set of conjoint experimental trials that are a fraction of all possible combinations and
attribute levels. You'll quickly learn how your respondents rank their preferences when you create and print cards they can sort. And,
with the results from the Conjoint procedure, you'll learn how your respondents rank product attributes. Here are more details on each procedure:
● Orthoplan enables you to generate orthogonal main effects fractional factorial designs and display results in pivot tables.
● Plancards enables you to produce printed cards for a conjoint experiment.
● Conjoint enables you to perform an ordinary least-squares analysis of preference or rating data, working with a plan file generated through
Orthoplan or with one entered from a data list. Various graphing and printing options are available.
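To make the idea of a fractional design concrete, here is a minimal Python sketch. It is not Orthoplan's algorithm; it assumes three hypothetical two-level attributes (brand, price and package are illustrative names only) and builds a half-fraction of the eight possible product profiles, then checks that every pair of attributes is balanced.

```python
from collections import Counter
from itertools import combinations, product

# Hypothetical two-level attributes; names and levels are illustrative only.
ATTRIBUTES = {
    "brand": ["A", "B"],
    "price": ["low", "high"],
    "package": ["box", "bag"],
}

def half_fraction():
    """Return 4 of the 8 possible profiles as tuples of level indices (0 or 1).

    The third attribute is the XOR of the first two, a standard way to
    construct a 2^(3-1) main-effects design.
    """
    return [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]

def is_orthogonal(design):
    """True if every pair of columns shows each level combination equally often."""
    n_cols = len(design[0])
    for i, j in combinations(range(n_cols), 2):
        counts = Counter((row[i], row[j]) for row in design)
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True

def to_cards(design):
    """Translate level indices into readable profile descriptions."""
    names = list(ATTRIBUTES)
    return [
        {name: ATTRIBUTES[name][level] for name, level in zip(names, row)}
        for row in design
    ]

design = half_fraction()
cards = to_cards(design)
```

Each entry in `cards` describes one profile that a respondent could be asked to rank. Four profiles are enough here to estimate the main effect of each attribute, at the cost of not being able to separate interactions.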
IBM® SPSS® Exact Tests gives you what's needed to more accurately work with small samples and analyze rare occurrences in large datasets.
IBM SPSS Exact Tests enables you to use small samples and still feel confident about the results. With the money saved using smaller sample sizes,
you can conduct surveys or test direct marketing programs more often. Stay ahead of the competition by using these resources to find new opportunities.
Easily Interpret and Apply Exact Tests
IBM SPSS Exact Tests is easy to use. You can perform a test any time, with just a click of a button - during your original analysis or when you rerun it. With IBM SPSS Exact Tests,
there is no steep learning curve, because you don't need to learn any new statistical theories or procedures. You simply interpret the exact test results the same way
you already interpret the results in IBM SPSS Statistics Base.
You'll always have the right statistical test for your data situation. IBM SPSS Exact Tests provides more than 30 exact tests, which cover the entire spectrum of
nonparametric and categorical data problems for small or large datasets. These tests include one-sample, two-sample and K-sample tests on independent or related samples,
goodness-of-fit tests, tests of independence in RxC contingency tables and on measures of association.
And, with the release of IBM SPSS Statistics 19, both the client and server versions of IBM SPSS Exact
Tests are available on Mac® and Linux®, as well as on Windows® operating systems.
More Statistics for Data Analysis
Expand the capabilities of IBM SPSS Statistics Base for the data analysis stage in the analytical process. Using IBM SPSS Exact Tests with IBM SPSS Statistics
Base gives you an even wider range of statistics, so you can get the most accurate response when:
● Working with a small number of cases
● Working with variables that have a high percentage of response in one category
● Dividing your data into fine breakdowns
● Searching for rare occurrences in large datasets (such as sales above $1 million)
IBM SPSS Exact Tests easily plugs into other IBM SPSS Statistics modules so you can seamlessly work in the IBM SPSS Statistics environment.
Get greater value from your data: with IBM SPSS Exact Tests, you can slice and dice your data into breakdowns, which can be as fine as you want, so you learn more by extending
your analysis to subgroups. You aren't limited by required expected counts of five or more per cell for correct results. And you can even rely on IBM SPSS Exact Tests when you're
searching for rare occurrences within large datasets.
Keep your original categories: don't lose valuable information by collapsing categories to meet the assumptions of traditional tests. With IBM SPSS Exact Tests, you can keep
your original design or natural categories (for example, regions, income, or age groups) and analyze what you intended to analyze.
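As an illustration of what an exact test computes, the following Python sketch implements a two-sided Fisher exact test for a 2x2 table. It is a textbook version written for this example, not the implementation used by IBM SPSS Exact Tests, and the table values are hypothetical. Every expected cell count in the example table is 2, well below the usual threshold of five.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    tolerance = 1e-12
    return sum(
        p
        for p in (table_prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + tolerance)
    )

# Hypothetical table: 8 cases split across two groups and two outcomes.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

Because the p-value is obtained by enumerating every table with the same margins, it does not rely on a large-sample approximation.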
IBM SPSS Exact Tests has the tests and statistics you need to get more insight from your small samples and rare occurrences within large databases. These procedures include:
IBM® SPSS® Advanced Statistics provides univariate and multivariate modeling techniques to help users reach the most accurate conclusions when working with data describing
complex relationships. These sophisticated analytical techniques are frequently applied to gain deeper insights from data used in disciplines such as medical
research, manufacturing, pharmaceuticals and market research.
SPSS Advanced Statistics provides the following capabilities:
● General linear models (GLM) and mixed models procedures.
● Generalized linear models (GENLIN) including widely used statistical models, such as linear regression for normally distributed
responses, logistic models for binary data and loglinear models for count data.
● Linear mixed models, also known as hierarchical linear models (HLM), which expand the general linear models used in
the GLM procedure so that you can analyze data that exhibit correlation and non-constant variability.
● Generalized estimating equations (GEE) procedures that extend generalized linear models to accommodate correlated longitudinal
data and clustered data.
● Generalized linear mixed models (GLMM) for use with hierarchical data and a wide range of outcomes, including ordinal values.
● Survival analysis procedures for examining lifetime or duration data.
General linear models (GLM)
● Describe the relationship between a dependent variable and a set of independent variables. Models include linear regression, analysis of variance (ANOVA), analysis of
covariance (ANCOVA), multivariate analysis of variance (MANOVA) and multivariate analysis of covariance (MANCOVA).
● Use flexible design and contrast options to estimate means and variances and to test and predict means.
● Mix and match categorical and continuous predictors to build models, choosing from many model-building possibilities.
● Use linear mixed models for greater accuracy when predicting nonlinear outcomes, such as what a customer is likely to buy, by taking into account hierarchical and nested data
structures.
● Formulate dozens of models, including split-plot design, multi-level models with fixed-effects covariance and randomized complete blocks design.
Generalized linear models (GENLIN)
● Provide a unifying framework that includes classical linear models with normally distributed dependent variables, logistic
and probit models for binary data, and loglinear models for count data, as well as various other nonstandard regression-type models.
● Apply many useful general statistical models including ordinal regression, Tweedie regression,
Poisson regression, Gamma regression and negative binomial regression.
Linear mixed models/hierarchical linear models (HLM)
● Model means, variances and covariances in data that display correlation and non-constant variability, such as students nested
within classrooms or consumers nested within families.
● Formulate dozens of models, including split-plot design, multi-level models with fixed-effects covariance, and randomized complete
blocks design.
● Select from 11 non-spatial covariance types, including first-order ante-dependence, heterogeneous, and first-order
autoregressive.
● Get more accurate results when working with repeated measures data, including situations in which there are different numbers of
repeated measurements, different intervals for different cases, or both.
Generalized estimating equations (GEE) procedures
● Extend generalized linear models to accommodate correlated longitudinal data and clustered data.
● Model correlations within subjects.
Generalized linear mixed models (GLMM)
● Access, manage and analyze virtually any kind of data set including survey data, corporate databases or data downloaded from the
web.
● Run the GLMM procedure with ordinal values to build more accurate models when predicting nonlinear outcomes such as whether a
customer's satisfaction level will fall under the low, medium or high category.
Survival analysis procedures
● Choose from a flexible and comprehensive set of techniques for understanding terminal events such as part failure, death or survival
rates.
● Use Kaplan-Meier estimations to gauge the length of time to an event.
● Select Cox regression to perform proportional hazard regression with time-to-response or duration response as the dependent
variable.
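To show what a Kaplan-Meier estimate involves, here is a small Python sketch of the product-limit calculation. It is a simplified illustration with hypothetical durations, not the IBM SPSS procedure, and it omits confidence intervals and group comparisons.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- observed durations
    events -- 1 if the event occurred at that time, 0 if the case was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at time t (event or censored) leaves the risk set.
        at_risk -= leaving
    return curve

# Hypothetical part-failure data: five parts, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored cases lower the number at risk without lowering the survival estimate, which is how the method uses incomplete observations.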
IBM® SPSS® Custom Tables helps you easily understand your data and quickly summarize your results in different styles for different audiences.
More than a simple reporting tool, IBM SPSS Custom Tables combines comprehensive analytical capabilities with interactive table-building features to help you learn from
your data and communicate the results of your analyses as professional-looking tables that are easy to read and interpret.
● Compare means or proportions for demographic groups, customer segments, time periods or other categorical variables when you include inferential statistics
● Select summary statistics - from simple counts for categorical variables to measures of dispersion - and sort categories by any summary statistic used
● Choose from three significance tests: Chi-square test of independence, comparison of column means (t test), or comparison of column proportions (z test)
● Drag and drop variables onto the interactive table builder to create results as pivot tables
● Preview tables in real time and modify them as you create them
● Exclude specific categories, display missing value cells and add subtotals to your tables
● Export tables to Microsoft® Word, Excel®, PowerPoint® or HTML for use in reports
IBM SPSS Custom Tables is an analytical tool that helps you augment your reports with information your readers need to make more informed decisions.
Use inferential statistics (also known as significance testing) in your tables to perform common analyses: Compare means or proportions for demographic groups,
customer segments, time periods, or other categorical variables; and identify trends, changes, or major differences in your data. IBM SPSS Custom Tables
includes the following significance tests:
● Chi-square test of independence
● Comparison of column means (t test)
● Comparison of column proportions (z test)
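For readers who want to see the arithmetic behind a comparison of two column proportions, here is a minimal Python sketch of a pooled two-proportion z test. It is a textbook formula with hypothetical counts. It does not reproduce any adjustments that IBM SPSS Custom Tables may apply, such as corrections for multiple comparisons.

```python
from math import erfc, sqrt

def two_proportion_z_test(successes_1, n_1, successes_2, n_2):
    """Return (z, two_sided_p) for the difference between two column proportions."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Hypothetical counts: 60 of 100 respondents in one column, 40 of 100 in another.
z, p_value = two_proportion_z_test(60, 100, 40, 100)
```

A small p-value suggests that the two columns differ by more than sampling variation alone would explain.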
You can also choose from a variety of summary statistics, which include everything from simple counts for categorical variables to measures of dispersion.
Summary statistics are included for:
● Categorical variables
● Multiple response sets
● Scale variables
● Custom total summaries for categorical variables
When your analysis is complete, you can use IBM SPSS Custom Tables to create customized tabular reports suitable for a variety
of audiences, including those without a technical background.
Preview your tables during production
With IBM SPSS Custom Tables' interactive table preview builder (see the figure below), you see how your tables look as you create or modify them. You can make adjustments
along the way, ensuring that your final table is exactly what you need. For example, you can:
● Move variables from row to column
● Collapse large or complex tables for a more concise view
● View variable categories while you work with them, and before you add them to your tables
● Preview the arrangement of variables (e.g., dimensions, stacking, or nesting), the categories of each variable, and requested statistics
Customize table layout and appearance
IBM SPSS Custom Tables gives you a high level of control over the layout and appearance of your tables. For example, you can:
● Create totals and subtotals without changing your data file
● Make your tables more precise by changing variable types or excluding categories
● Sort categories by any summary statistic in your table and hide the categories that comprise subtotals
● Select from 16 pre-formatted styles in TableLooks™ (found within IBM SPSS Statistics Base) or create your own styles. Through Statistics Base,
you can also add boldface lines; draw lines; align left, right, or center; and specify titles and footnotes.
Make your results available easily
IBM SPSS Custom Tables helps you improve your workflow by producing all results as IBM SPSS pivot tables. You can easily export them to Microsoft® Word or Excel®
with your formatting intact, so your information gets to the right people quickly and accurately.
IBM® SPSS® Regression enables you to predict categorical outcomes and apply a wide range of nonlinear regression procedures.
You can apply IBM SPSS Regression to many business and analysis projects where ordinary regression techniques are limiting or inappropriate: for example,
studying consumer buying habits or responses to treatments, measuring academic achievement, and analyzing credit risks.
IBM SPSS Regression includes the following procedures:
● Multinomial logistic regression: Predict categorical outcomes with more than two categories
● Binary logistic regression: Easily classify your data into two groups
● Nonlinear regression and constrained nonlinear regression (CNLR): Estimate parameters of nonlinear models
● Weighted least squares: Gives more weight to measurements within a series
● Two-stage least squares: Helps control for correlations between predictor variables and error terms
● Probit analysis: Evaluate the value of stimuli using a logit or probit transformation of the proportion responding
More Statistics for Data Analysis
Expand the capabilities of IBM® SPSS® Statistics Base for the data analysis stage in the analytical process. Using IBM SPSS Regression with IBM SPSS Statistics Base
gives you an even wider range of statistics so you can get the most accurate response for specific data types.
IBM SPSS Regression includes:
● Multinomial logistic regression (MLR): Regress a categorical dependent variable with more than two categories on a set of independent variables. This procedure
helps you accurately predict group membership within key groups.
You can also use stepwise functionality, including forward entry, backward elimination, forward stepwise or backward stepwise,
to find the best predictor from dozens of possible predictors. If you have a large number of predictors, Score and Wald methods can help you more quickly reach results.
You can assess your model fit using Akaike information criterion (AIC) and Bayesian information criterion (BIC; also called Schwarz Bayesian criterion, or SBC).
● Binary logistic regression: Group people with respect to their predicted action. Use this procedure if you need to build models in which the dependent
variable is dichotomous (for example, buy versus not buy, pay versus default, graduate versus not graduate). You can also use binary logistic
regression to predict the probability of events such as solicitation responses or program participation.
With binary logistic regression, you can select variables using six types of stepwise methods, including forward (the procedure selects the strongest variables until
there are no more significant predictors in the dataset) and backward (at each step, the procedure removes the least significant predictor in the dataset) methods.
You can also set inclusion or exclusion criteria. The procedure produces a report telling you the action it took at each step to determine your variables.
● Nonlinear regression (NLR) and constrained nonlinear regression (CNLR): Estimate nonlinear equations.
If you are working with models that have nonlinear relationships, for example, if you are predicting coupon redemption as a function of time and number of
coupons distributed, estimate nonlinear equations using one of two IBM SPSS Statistics procedures: nonlinear regression (NLR) for unconstrained problems and
constrained nonlinear regression (CNLR) for both constrained and unconstrained problems.
NLR enables you to estimate models with arbitrary relationships between independent and dependent variables using iterative estimation algorithms, while CNLR enables you to:
◘ Use linear and nonlinear constraints on any combination of parameters
◘ Estimate parameters by minimizing any smooth loss function (objective function)
◘ Compute bootstrap estimates of parameter standard errors and correlations
● Weighted least squares (WLS): If the spread of residuals is not constant, the estimated standard errors
will not be valid. Use weighted least squares to estimate the model instead (for example, when predicting stock values, stocks with higher share values fluctuate
more than stocks with lower share values).
● Two-stage least squares (2SLS): Use this technique to estimate your dependent variable when the independent
variables are correlated with the regression error terms.
For example, a book club may want to model the amount they cross-sell to members using the amount that members spend on books as a predictor.
However, money spent on other items is money not spent on books, so an increase in cross-sales corresponds to a decrease in book sales.
Two-Stage Least-Squares Regression corrects for this error.
● Probit analysis: Probit analysis is most appropriate when you want to estimate the effects of one or more
independent variables on a categorical dependent variable.
For example, you would use probit analysis to establish the relationship between the percentage taken off a product's price and whether a customer
will buy as the price decreases. Then, for every percent taken off the price, you can work out the probability that a consumer will buy the product.
IBM SPSS Regression includes additional diagnostics for use when developing a classification table.
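The weighted least squares idea described above can be sketched in a few lines of Python. This is a minimal one-predictor illustration, not the IBM SPSS WLS procedure. The data and weights are hypothetical, and the sketch assumes the weights are already known (for example, the inverse of each observation's variance).

```python
def weighted_least_squares(x, y, weights):
    """Fit y = intercept + slope * x by minimizing the weighted sum of squared residuals."""
    total_weight = sum(weights)
    x_mean = sum(w * xi for w, xi in zip(weights, x)) / total_weight
    y_mean = sum(w * yi for w, yi in zip(weights, y)) / total_weight
    s_xy = sum(
        w * (xi - x_mean) * (yi - y_mean) for w, xi, yi in zip(weights, x, y)
    )
    s_xx = sum(w * (xi - x_mean) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Hypothetical data: the third observation is noisier, so it gets a small weight.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 9.0, 9.0]
weights = [1.0, 1.0, 0.1, 1.0]
intercept, slope = weighted_least_squares(x, y, weights)
```

Observations with small weights pull the fitted line less, so highly variable cases do not dominate the estimate.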
IBM® SPSS® Forecasting enables analysts to predict trends and develop forecasts quickly and easily -- without being an expert statistician.
Reliable forecasts can have a major impact on your organization's ability to develop and implement successful strategies. Unlike spreadsheet programs,
IBM SPSS Forecasting has the advanced statistical techniques needed to work with time-series data regardless of your level of expertise.
● Analyze historical data and predict trends faster, and deliver information in ways that your organization's decision makers can understand and use
● Automatically determine the best-fitting ARIMA or exponential smoothing model to analyze your historic data
● Model hundreds of different time series at once, rather than having to run the procedure for one variable at a time
● Save models to a central file so that forecasts can be updated when data changes, without having to re-set parameters or re-estimate models
● Write scripts so that models can be updated with new data automatically
IBM SPSS Forecasting offers a number of capabilities that enable both novice and experienced users to quickly develop reliable forecasts using time-series data.
It is a fully integrated module of IBM SPSS Statistics, giving you all of IBM SPSS Statistics' capabilities plus features specifically designed to support forecasting.
New to Building Models from Time-series Data?
IBM SPSS Forecasting helps you by:
● Generating reliable models, even if you're not sure how to choose exponential smoothing parameters or ARIMA orders, or how to achieve stationarity
● Automatically testing your data for seasonality, intermittency and missing values, and selecting appropriate models
● Detecting outliers and preventing them from influencing parameter estimates
● Generating graphs showing confidence intervals and the model's goodness of fit
You're an Experienced IBM SPSS Statistics User?
IBM SPSS Forecasting allows you to:
● Control every parameter when building your data model
● Use IBM SPSS Forecasting Expert Modeler recommendations as a starting point or to check your work
Procedures and Statistics for Analyzing Time-series Data
Using IBM SPSS Forecasting with IBM SPSS Statistics Base gives you a selection of statistical techniques for analyzing time-series data and developing reliable forecasts.
Techniques Tailored to Time-series Analysis
IBM SPSS Statistics has the procedures you need to realize the most benefit from your time-series analysis. It generates statistics and normal
probability plots so that you can easily judge model fit. You can even limit output to see only the worst-fitting models -- those that require further examination.
Automatically generated high-resolution charts enhance your output.
Procedures available in IBM SPSS Forecasting include:
● TSMODEL - Use the Expert Modeler to model a set of time-series variables, using either ARIMA or exponential smoothing techniques
● TSAPPLY - Apply saved models to new or updated data
● SEASON - Estimate multiplicative or additive seasonal factors for periodic time series
● SPECTRA - Decompose a time series into its harmonic components, which are sets of regular periodic functions at different wavelengths or periods
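As a point of reference for the exponential smoothing techniques mentioned above, here is a minimal Python sketch of simple (single) exponential smoothing. It is not the Expert Modeler: the smoothing parameter is chosen by hand rather than estimated, the series is hypothetical, and trend and seasonality are ignored.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    alpha is the smoothing parameter (0 < alpha <= 1); the last level
    serves as the one-step-ahead forecast.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise with the first observation
    levels = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels

# Hypothetical series of three observations.
levels = simple_exponential_smoothing([10, 20, 30], alpha=0.5)
forecast = levels[-1]
```

A larger alpha makes the forecast react faster to recent values; a smaller alpha gives a smoother, slower-moving forecast.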
IBM® SPSS® Data Preparation gives analysts advanced techniques to streamline the data preparation stage of the analytical process.
Used with IBM SPSS Statistics Base, IBM SPSS Data Preparation provides specialized techniques to prepare your data for more accurate analyses and results.
With IBM SPSS Data Preparation, you can:
● Quickly identify suspicious or invalid cases, variables and data values
● View patterns of missing data
● Summarize variable distributions
● Optimally bin scale data
● More accurately prepare your data for analysis
● Use Automated Data Preparation (ADP) to detect and correct quality errors and impute missing values in one efficient step
● Get recommendations and visualizations to help you determine which data to use
Features & Benefits
Expand your Data Preparation Techniques with IBM SPSS Data Preparation
Use the specialized data preparation techniques in IBM SPSS Data Preparation to facilitate data preparation in the analytical process.
IBM SPSS Data Preparation easily plugs into IBM SPSS Statistics Base so you can seamlessly work in the IBM SPSS environment.
Perform Data Checks
Data validation has typically been a manual process. You might run a frequency on your data, print the frequencies, circle what needs to be fixed and check for case IDs. This approach is time consuming and prone to errors.
And since every analyst in your organization could use a slightly different method, maintaining consistency from project to project may be a challenge.
To eliminate manual checks, use the IBM SPSS Data Preparation Validate Data procedure. This enables you to apply rules to perform data checks based on
each variable's measurement level (whether categorical or continuous).
For example, if you're analyzing data that has variables on a five-point Likert scale, use the Validate Data procedure to apply a rule for five-point scales and flag all cases that have values outside of the 1-5 range. You can receive reports of invalid cases as well as summaries of rule violations and the number of cases affected.
You can specify validation rules for individual variables (such as range checks) and cross-variable checks (for example, "retired 30 year-olds").
With this knowledge you can determine data validity and remove or correct suspicious cases at your discretion before analysis.
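The two kinds of rules described above, single-variable range checks and cross-variable checks, can be sketched in Python as follows. This is an illustration only, not the Validate Data procedure. The field names, the sample cases and the age threshold of 40 are all hypothetical choices made for the example.

```python
def likert_in_range(case):
    """Single-variable rule: the rating must be an integer from 1 to 5."""
    return case["rating"] in (1, 2, 3, 4, 5)

def plausible_retirement(case):
    """Cross-variable rule: flag retired respondents younger than 40 (assumed cutoff)."""
    return not (case["retired"] and case["age"] < 40)

RULES = {
    "rating_1_to_5": likert_in_range,
    "retired_age_check": plausible_retirement,
}

def validate(cases):
    """Return {case_id: [names of violated rules]} for cases that fail any rule."""
    report = {}
    for case in cases:
        failed = [name for name, rule in RULES.items() if not rule(case)]
        if failed:
            report[case["id"]] = failed
    return report

cases = [
    {"id": 1, "rating": 4, "age": 52, "retired": False},
    {"id": 2, "rating": 7, "age": 45, "retired": False},
    {"id": 3, "rating": 2, "age": 30, "retired": True},
]
violations = validate(cases)
```

Keeping the rules in one named collection means every analyst applies the same checks, and the report shows which rule each flagged case violated.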
Quickly Find Multivariate Outliers
Prevent outliers from skewing analyses when you use the IBM SPSS Data Preparation Anomaly Detection procedure. This searches for unusual cases based upon deviations from similar cases, and gives reasons for such deviations. You can flag outliers by creating a new variable.
Once you have identified unusual cases, you can further examine them and determine if they should be included in your analyses.
Pre-process Data before Model Building
In order to use algorithms that are designed for nominal attributes (such as Naive Bayes and logit models), you must bin your scale variables before model building. If scale variables aren't binned, algorithms such as multinomial logistic regression will take an extremely long time to process or they might not converge.
This is especially true if you have a large dataset. In addition, the results you receive may be difficult to read or interpret.
IBM SPSS Data Preparation Optimal Binning, however, enables you to determine cutpoints to help you reach the best possible outcome for algorithms designed for nominal attributes.
With this procedure, you can select from three types of binning for preprocessing data:
● Unsupervised -- create bins with equal counts
● Supervised -- take the target variable into account to determine cutpoints.
This method is more accurate than unsupervised; however, it is also more computationally intensive.
● Hybrid approach -- combines the unsupervised and supervised approaches.
This method is particularly useful if you have a large number of distinct values.
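The unsupervised option, bins with equal counts, is the simplest of the three to illustrate. The Python sketch below is a minimal version with hypothetical data. It is not the Optimal Binning procedure and does not cover the supervised or hybrid approaches.

```python
from bisect import bisect_left

def equal_count_cutpoints(values, n_bins):
    """Return n_bins - 1 upper cutpoints that split the sorted values evenly."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_bins - 1] for k in range(1, n_bins)]

def assign_bin(value, cutpoints):
    """Bin index (0-based): a value equal to a cutpoint stays in the lower bin."""
    return bisect_left(cutpoints, value)

values = list(range(1, 13))  # hypothetical scale variable with 12 cases
cutpoints = equal_count_cutpoints(values, 3)
bins = [assign_bin(v, cutpoints) for v in values]
```

With tied values the bin counts can only be approximately equal, because identical values always land in the same bin.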
IBM® SPSS® Bootstrapping makes it simple to test the stability and reliability of your models so that they produce accurate, reliable results.
Whether you conduct academic or scientific research, study issues in the public sector or provide the analyses that support business decisions, it's important
that your models are stable. Test model stability quickly and easily with IBM SPSS Bootstrapping.
IBM SPSS Bootstrapping provides an efficient way to ensure that your models are stable and reliable, so your analysis generates more accurate results.
With IBM SPSS Bootstrapping, you can:
● Quickly and easily estimate the sampling distribution of an estimator by re-sampling with replacement from the original
sample
● Estimate the standard errors and confidence intervals of a population parameter such as the mean, median, proportion,
odds ratio, correlation coefficient, regression coefficient, and numerous others
● Create thousands of alternate versions of your dataset for more accurate analysis
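The resampling idea in the list above can be sketched with the Python standard library. This is a minimal percentile bootstrap for a single statistic, using a hypothetical sample and a fixed random seed for repeatability. It is not the IBM SPSS Bootstrapping implementation, which may use different interval methods.

```python
import random
from statistics import mean

def bootstrap_ci(sample, statistic=mean, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = sorted(
        statistic(rng.choices(sample, k=n))  # resample with replacement
        for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lower = estimates[round(tail * n_resamples)]
    upper = estimates[round((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical sample of ten measurements.
sample = [12.1, 9.8, 11.4, 10.3, 13.7, 8.9, 10.8, 12.5, 9.5, 11.0]
lower, upper = bootstrap_ci(sample)
```

Passing a different function as `statistic` (for example, `statistics.median`) gives an interval for that estimator without any change to the resampling logic.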
IBM SPSS Bootstrapping helps reduce the impact of outliers and anomalies that can degrade the accuracy or applicability of your analysis. As a result, you have a clearer
view of your data for creating the model you are working with.
● Fast, easy re-sampling -- estimate the sampling distribution of an estimator in a snap
● Reduce the impact of outliers and anomalies -- ensure the stability and reliability of your models
● Bootstrap many analytical procedures -- test a wide range of the descriptive and modeling procedures found in the IBM SPSS Statistics product family
IBM SPSS Bootstrapping works with a number of analytical procedures in the IBM SPSS Statistics product family, including:
IBM® SPSS® Categories provides you with all the tools you need to obtain clear insight into complex categorical and numeric data,
as well as high-dimensional data.
Use IBM SPSS Categories to understand which characteristics consumers relate most closely to your brand, or to determine customer perception of your products
compared to other products you or your competitors offer.
● Discover underlying relationships through perceptual maps, biplots and triplots
● Work with and understand nominal (e.g. salary) and ordinal (e.g. education level) data with procedures similar to conventional regression, principal components and canonical correlation to predict outcomes and reveal relationships
● Visually interpret datasets and see how rows and columns relate in large tables of scores, counts, ratings, rankings or similarities
● Deal with non-normal residuals in numeric data or nonlinear relationships between predictor variables
(e.g. customer or product attributes) and the outcome variable (e.g. purchase/non-purchase)
Features & Benefits
Unleash the full potential of your data through predictive analysis, statistical learning, perceptual mapping, preference scaling
and dimension reduction techniques, including optimal scaling of your variables.
Graphically display underlying relationships
IBM SPSS Categories' dimension reduction techniques enable you to clarify relationships in your data by using perceptual maps and biplots:
● Perceptual maps are high-resolution summary charts that graphically display similar variables or categories close to each other.
They provide you with unique insight into relationships between more than two categorical variables.
● Biplots and triplots enable you to look at the relationships among cases, variables and categories.
For example, you can define relationships between products, customers and demographic characteristics.
By using the preference scaling feature, you can further visualize relationships among objects. The breakthrough algorithm on which this procedure is based enables you to perform non-metric analyses for ordinal data and obtain meaningful results.
The proximities scaling procedure allows you to analyze similarities between objects, and incorporate characteristics for objects in the same analysis.
Turn qualitative variables into quantitative ones
Perform additional statistical operations on categorical data with the advanced procedures available in IBM SPSS Categories:
● Use optimal scaling procedures to assign units of measurement and zero-points to your categorical data
● Choose from state-of-the-art procedures for model selection and regularization
● Perform correspondence and multiple correspondence analyses to numerically evaluate similarities between two or more nominal variables in your dataset
● Summarize your data according to important components by using principal components analysis
● Quantify your ordinal and nominal variables with an optimal scaling correlation matrix
● Use nonlinear canonical correlation analysis to incorporate and analyze variables of different measurement levels
Procedures and statistics for analyzing categorical data
Using IBM SPSS Categories with IBM SPSS Statistics Base gives you a selection of statistical techniques for analyzing high-dimensional or categorical data, including:
● Categorical regression that predicts the values of a nominal, ordinal or numerical outcome variable from a
combination of categorical predictor variables. Optimal scaling techniques are used to quantify variables. Three regularization methods (ridge regression, the lasso and the elastic net)
improve prediction accuracy by stabilizing the parameter estimates.
● Correspondence analysis that enables you to analyze two-way tables that contain some measurement of correspondence between rows and columns,
as well as display rows and columns as points in a map.
● Multiple correspondence analysis that enables you to analyze multivariate categorical data using more
than two variables in your analysis. With this procedure, all the variables are analyzed at the nominal level (unordered categories).
● Categorical principal components analysis uses optimal scaling to generalize the principal components analysis procedure so that
it can accommodate variables of mixed measurement levels.
● Nonlinear canonical correlation analysis uses optimal scaling to generalize the canonical correlation analysis procedure so that it can accommodate variables of mixed measurement levels. This type of analysis enables you to
compare multiple sets of variables to one another in the same graph, after removing the correlation within sets.
● Multidimensional scaling performs multidimensional scaling of one or more matrices with
similarities or dissimilarities (proximities).
● Preference scaling visually examines relationships between two sets of objects, for example, consumers and products.
Preference scaling performs multidimensional unfolding in order to find a map that represents the relationships between
these two sets of objects as distances between two sets of points.
IBM® SPSS® Decision Trees module helps you better identify groups, discover relationships between them and predict future events.
This module features highly visual classification and decision trees. These trees enable you to present categorical results in an intuitive manner,
so you can more clearly explain categorical analysis to non-technical audiences.
IBM SPSS Decision Trees enables you to explore results and visually determine how your model flows. This helps you find specific subgroups and relationships that you might not
uncover using more traditional statistics. The module includes four established tree-growing algorithms.
Use IBM SPSS Decision Trees if you need to identify groups and sub-groups. Applications include:
● Database marketing
● Market research
● Credit risk scoring
● Program targeting
● Marketing in the public sector
Features & Benefits
Choose from four established tree-growing algorithms and discover hidden relationships in your data.
IBM SPSS Decision Trees provides specialized tree-building techniques for classification - entirely within the IBM SPSS Statistics environment.
It includes four established tree-growing algorithms:
● CHAID - A fast, statistical, multi-way tree algorithm that explores data quickly and efficiently, and builds
segments and profiles with respect to the desired outcome
● Exhaustive CHAID - A modification of CHAID, which examines all possible splits for each predictor
● Classification and regression trees (C&RT) - A complete binary tree algorithm, which partitions data and produces accurate homogeneous subsets
● QUEST - A statistical algorithm that selects variables without bias and builds accurate binary trees quickly and efficiently
With four algorithms, you have the ability to try different types of tree-growing algorithms and find the one that best fits your data.
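To make the idea of a binary tree split concrete, here is a minimal Python sketch of how a single split could be chosen. It assumes Gini impurity as the splitting criterion, which is commonly associated with classification and regression trees but is not stated in this document; the function names and the tiny `age`/`responded` dataset are hypothetical, and the sketch is not the module's implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, weighted_impurity) for the best split `value <= threshold`."""
    best = (None, gini(labels))
    n = len(labels)
    # Try every distinct value except the largest as a candidate threshold.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Weight each child's impurity by its share of the cases.
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical predictor (age) and outcome (responded to an offer).
age = [22, 25, 31, 38, 44, 52, 58, 63]
responded = ["no", "no", "no", "yes", "yes", "yes", "yes", "no"]
threshold, impurity = best_binary_split(age, responded)
print(threshold, round(impurity, 3))
```

A full tree repeats this search within each resulting subset until a stopping rule is met.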
Because you create classification trees directly within IBM SPSS Statistics, you can conveniently use the results to segment and group cases directly within the data. Additionally, you can generate selection
or classification/prediction rules in the form of IBM SPSS Statistics syntax, SQL statements or simple text (through syntax).
You can display these rules in the Viewer and save them to an external file for later use to make predictions about individual and new cases.
If you'd like to use your results to score other data files, you can write information from the tree model directly to your data or create XML models for use in
IBM SPSS Statistics Server.
IBM® SPSS® Missing Values is used by survey researchers, social scientists, data miners, market researchers and others to validate data.
Missing data can seriously affect your models - and your results. Ignoring missing data, or assuming that excluding missing data is sufficient, risks reaching invalid and insignificant results.
To ensure that you take missing values into account, make IBM SPSS Missing Values part of your data management and preparation.
Uncover Missing Data Patterns
● Easily examine data from several different angles using one of six diagnostic reports,
then estimate summary statistics and impute missing values
● Quickly diagnose serious missing data imputation problems
● Replace missing values with estimates
● Display a snapshot of each type of missing value and any extreme values for each case
● Remove hidden bias by replacing missing values with estimates to include all groups -- even those with poor responsiveness
Features & Benefits
Uncover Missing Data Patterns
With IBM SPSS Missing Values, you can easily examine data from several different angles using one of six diagnostic reports to uncover missing data patterns.
You can then estimate summary statistics and impute missing values through regression or expectation maximization algorithms (EM algorithms).
IBM SPSS Missing Values helps you to:
● Diagnose if you have a serious missing data imputation problem
● Replace missing values with estimates -- for example, impute your missing data with the regression or EM algorithms
Quickly and Easily Diagnose Your Missing Data
Quickly diagnose a serious missing data problem using the data patterns report, which provides a case-by-case overview of your data.
This report helps you determine the extent of missing data; it displays a snapshot of each type of missing value and any extreme values for each case.
Reach More Valid Conclusions
Replace missing values with estimates and increase the chance of receiving statistically significant results. Remove hidden bias from your data by replacing missing
values with estimates to include all groups in your analysis - even those with poor responsiveness.
Use Multiple Imputation to Replace Missing Data Values
IBM SPSS Missing Values' multiple imputation procedure will help you understand patterns of "missingness" in your dataset and enable you to replace missing values with plausible estimates.
It offers a fully automatic imputation mode that chooses the most suitable imputation method based on characteristics of your data, while also allowing you to customize your imputation model.
Several complete datasets are generated (typically, three to five), each with a different set of replacement values. Next, you can model the individual datasets,
using techniques such as linear regression, to produce parameter estimates for each dataset. Finally, you obtain final parameter estimates by pooling the individual
sets of parameter estimates from the previous step and computing inferential statistics that take into account variation within and between imputations.
Analysis of the individual datasets and pooling of the results are supported via existing IBM SPSS Statistics procedures such as REGRESSION.
When operating on datasets with imputed values, existing procedures will automatically produce pooled parameter estimates.
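The pooling step can be illustrated with a short Python sketch. It uses the combining formulas commonly known as Rubin's rules; this document does not name the formulas the product applies, so treat that choice, the function name `pool_estimates` and the example numbers as assumptions.

```python
import math
import statistics


def pool_estimates(estimates, std_errors):
    """Pool one parameter across m imputed datasets.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    # Pooled point estimate: the average of the per-dataset estimates.
    pooled = statistics.fmean(estimates)
    # Within-imputation variance: the average squared standard error.
    within = statistics.fmean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the estimates.
    between = statistics.variance(estimates)
    # Total variance adds a finite-m correction to the between component.
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical regression coefficients from five imputed datasets.
coefficients = [2.10, 1.95, 2.25, 2.05, 2.15]
coefficient_ses = [0.30, 0.32, 0.29, 0.31, 0.30]
pooled, pooled_se = pool_estimates(coefficients, coefficient_ses)
print(f"pooled estimate={pooled:.3f}, pooled SE={pooled_se:.3f}")
```

The pooled standard error is larger than the average per-dataset standard error because it also reflects the disagreement between imputations.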
Fill in the Blanks for Improved Data Management
IBM SPSS Missing Values has the statistics you need to fill in missing data:
● Univariate: compute count, mean, standard deviation, and standard error of mean for all cases excluding those containing missing values,
count and percent of missing values, and extreme values for all variables
● Listwise: compute mean, covariance matrix, and correlation matrix for all quantitative variables for cases excluding missing values
● Estimate the means, covariance matrix, and correlation matrix of quantitative variables with missing values, assuming normal distribution,
t distribution with degrees of freedom, or a mixed-normal distribution with any mixture proportion and any standard deviation ratio
● Impute missing data and save the completed data as a file
● Regression algorithm: estimate the means, covariance matrix, and correlation matrix of variables set as dependent; set number of predictor variables;
set random elements as normal, t, residuals, or none
IBM SPSS Missing Values also has features that enable you to analyze patterns and manage data, including the ability to:
● Display missing data and extreme cases for all cases and all variables using the data patterns table
● Determine differences between missing and non-missing groups for a related variable with the separate t test table
● Assess how much missing data for one variable relates to the missing data of another variable using the percent
mismatch of patterns table
IBM® SPSS® Neural Networks offers non-linear data modeling procedures that enable you to discover more complex relationships in your data.
Using the procedures in IBM SPSS Neural Networks, you can develop more accurate and effective predictive models.
The result? Deeper insight and better decision making.
What is a neural network?
A computational neural network is a set of non-linear data modeling tools consisting of input and output layers plus one or two hidden layers. The connections between neurons in each layer have associated weights,
which are iteratively adjusted by the training algorithm to minimize error and provide accurate predictions.
Complement traditional statistical techniques
The procedures in IBM SPSS Neural Networks complement the more traditional statistics in IBM SPSS Statistics Base and its modules. Find new associations in
your data with Neural Networks and then confirm their significance with traditional statistical techniques.
Features & Benefits
How can you use IBM SPSS Neural Networks?
You can combine Neural Networks with other statistical procedures to gain clearer insight in a number of areas:
Market research
● Create customer profiles
● Discover customer preferences
Database marketing
● Segment your customer base
● Optimize campaigns
Financial analysis
● Analyze applicants' creditworthiness
● Detect possible fraud
Operational analysis
● Manage cash flow
● Improve logistics planning
Healthcare
● Forecast treatment costs
● Perform medical outcomes analysis
Use data mining techniques
IBM SPSS Neural Networks provides a complementary approach to the data analysis techniques available in IBM SPSS Statistics Base and its modules.
From the familiar IBM SPSS Statistics interface, you can "mine" your data for hidden relationships, using either the Multilayer Perceptron (MLP) or Radial Basis Function (RBF) procedure.
Both of these are supervised learning techniques - that is, they map relationships implied by the data. Both use feed-forward architectures,
meaning that data moves in only one direction, from the input nodes through the hidden layer or layers of nodes to the output nodes.
Your choice of procedure will be influenced by the type of data you have and the level of complexity you seek to uncover. While the MLP procedure can
find more complex relationships, the RBF procedure is generally faster.
With either of these approaches, the procedure operates on a training set of data and then applies that knowledge to the entire dataset, and to any new data.
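The feed-forward flow described above, from input nodes through a hidden layer to output nodes, can be sketched as follows. This is an illustrative forward pass only, with no training: the weights are made-up numbers, and the tanh and sigmoid activation functions are assumptions rather than a statement of what the MLP procedure uses.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights . inputs + bias) per node."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Data moves one way only: inputs -> hidden layer -> output layer."""
    hidden = dense(inputs, hidden_w, hidden_b, math.tanh)
    return dense(hidden, output_w, output_b, sigmoid)


# Hypothetical weights for 2 inputs, 2 hidden nodes and 1 output node.
hidden_w = [[0.5, -0.4], [0.3, 0.8]]
hidden_b = [0.0, -0.1]
output_w = [[1.2, -0.7]]
output_b = [0.05]
prediction = forward([1.0, 2.0], hidden_w, hidden_b, output_w, output_b)
print(prediction)
```

Training would adjust `hidden_w`, `hidden_b`, `output_w` and `output_b` iteratively to reduce prediction error on the training set.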
Control the process from start to finish
After selecting a procedure, you specify the dependent variables, which may be scale, categorical or a combination of the two. You adjust the procedure by choosing how to partition the dataset, what sort of architecture you want and what computation resources will be applied to the analysis.
Finally, you choose whether you want to display results in tables or graphs, save optional temporary variables to the active dataset and/or export models in
XML-based file format to score future data.
IBM® SPSS® Complex Samples helps make more statistically valid inferences by incorporating the sample design into survey analysis.
IBM SPSS Complex Samples provides the specialized planning tools and statistics you need when working with complex sample designs, such as stratified, clustered or multistage sampling.
This module of IBM SPSS Statistics is indispensable for survey and market researchers, public opinion researchers or social scientists seeking to reach more accurate conclusions when working
with sample survey methodology. You can more accurately work with numerical and categorical outcomes in complex sample designs using two algorithms for analysis and prediction. In addition, you can
use this module's techniques to predict time to an event.
Features & Benefits
Work efficiently and easily with complex sample survey results
IBM® SPSS® Complex Samples makes it easy to understand and work with your complex sample survey results.
Choose from one of several wizards in the intuitive interface to create plans, analyze data and interpret results.
When you're finished, you can publish public-use datasets and include your sampling and analysis plans. These plans act as a template and allow you
to save all the decisions made when creating the plan - define it once and you're done. This saves time and improves accuracy for yourself and others who
may want to plug your plans into the data to replicate results or pick up where you left off.
Use the following types of sample design information with IBM SPSS Complex Samples:
● Stratified sampling - Increase the precision of your sample or ensure a representative sample
from key groups by choosing to sample within subgroups of the survey population.
● Clustered sampling - Select clusters, which are groups of sampling units, for your survey. Clustering often helps make surveys more cost-effective.
● Multistage sampling - Select an initial or first-stage sample based on groups of elements in your population; then create a second-stage sample by drawing
a sub-sample from each selected unit in the first-stage sample. By repeating this step, you can select higher-stage samples.
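As a rough illustration of stratified sampling, the Python sketch below draws the same fraction from every stratum and attaches a design weight (the inverse of the selection probability) to each sampled unit. The function name, the `region` strata and the 10% sampling fraction are hypothetical; this is not the CSPLAN or CSSELECT implementation.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=0):
    """Return a list of (unit, design_weight) drawn separately within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for name in sorted(strata):
        members = strata[name]
        # Sample at least one unit per stratum so every group is represented.
        k = max(1, round(fraction * len(members)))
        for unit in rng.sample(members, k):
            # Each sampled unit stands in for len(members) / k population units.
            sample.append((unit, len(members) / k))
    return sample


# Hypothetical population of 100 units in three regions.
population = (
    [(f"N{i}", "north") for i in range(60)]
    + [(f"S{i}", "south") for i in range(30)]
    + [(f"W{i}", "west") for i in range(10)]
)
sample = stratified_sample(population, stratum_of=lambda unit: unit[1], fraction=0.1)
print(len(sample), sum(weight for _, weight in sample))
```

Carrying the design weights into the analysis is what lets estimates reflect the sample design rather than treating the data as a simple random sample.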
Everything You Need for Planning
To help you through the planning stage in the analytical process, IBM SPSS Complex Samples provides you with specialized tools and procedures for working with sample survey data:
● IBM SPSS Complex Samples Plan (CSPLAN) - Use this procedure to specify the sampling frame to create a complex sample design or analysis
specification used by companion procedures in IBM SPSS Complex Samples.
● Sampling Plan Wizard - If you are creating your own samples, use the Sampling Plan Wizard to define the scheme and draw the sample.
● Analysis Preparation Wizard - If you're using public-use datasets that already have samples, use the Analysis Preparation Wizard to specify how
the samples were defined and how standard errors should be estimated.
● Plan files - Once you have created plan files, you can save them and treat them as templates. This allows you to save all the decisions you made
when creating the plan. This saves time and improves accuracy for yourself and others who may want to plug your plans into the data to replicate results or pick up where you left off.
Everything You Need for Data Management
IBM SPSS Complex Samples provides what you need for the data management stage when working with sample survey data. And it easily plugs into other IBM SPSS Statistics modules
so you can seamlessly work in the IBM SPSS Statistics environment.
IBM SPSS Complex Samples Selection (CSSELECT) procedure -- Enables you to select complex, probability-based samples from a population while mitigating the risk in
doing so (e.g. over- or under-representing a subgroup). CSSELECT chooses units according to a sample design created through the CSPLAN procedure.
With this procedure, you can:
● Control the scope of execution and specify a seed value with the CRITERIA subcommand
● Control whether or not user-missing values of classification (stratification and clustering)
variables are treated as valid values with the CLASSMISSING subcommand
● Specify general options concerning input and output files with the DATA subcommand
● Write sampled units to an external file using an option to keep/drop specified variables
● Automatically save first-stage joint inclusion probabilities to an external file when the plan
specifies a probability proportionate to size (PPS) without replacement (WOR) sampling method
● Opt to generate text files containing a rule that describes characteristics of selected units
Performing data analysis in IBM SPSS Complex Samples helps you to achieve more statistically valid inferences for populations measured in your complex
sample data. IBM SPSS Complex Samples provides you with better results because, unlike most conventional statistical software, it incorporates the sample
design into survey analysis.
IBM SPSS Complex Samples features five procedures to analyze sample survey data:
● IBM SPSS Complex Samples Descriptives (CSDESCRIPTIVES) - Estimates means, sums and ratios, and computes
standard errors, design effects, confidence intervals and hypothesis tests for samples drawn by complex sampling methods.
● IBM SPSS Complex Samples Tabulate (CSTABULATE) - Displays one-way frequency tables or two-way
crosstabulations and associated standard errors, design effects, confidence intervals and hypothesis tests for samples drawn by complex sampling methods.
● IBM SPSS Complex Samples General Linear Models (CSGLM) - Enables you to build linear regression, analysis of
variance (ANOVA), and analysis of covariance (ANCOVA) models for samples drawn by complex sampling methods.
● IBM SPSS Complex Samples Logistic Regression (CSLOGISTIC) - Performs binary logistic regression analysis,
as well as multinomial logistic regression (MLR) analysis, for samples drawn by complex sampling methods.
● IBM SPSS Complex Samples Cox Regression (CSCOXREG) - Applies Cox proportional hazards regression to analysis
of survival times; that is, the length of time before the occurrence of an event for samples drawn by complex sampling methods.
Give yourself greater flexibility by extending the IBM SPSS Statistics command syntax language with external programming languages.
Developers and end users can extend the command syntax language, introduce additional statistical functionality and access the IBM SPSS Statistics engine
from external applications through the IBM SPSS Statistics Programmability Extension.
Expand IBM SPSS Statistics capabilities through programmability
With the IBM SPSS Statistics Programmability Extension, which is included with IBM SPSS Statistics Base and any of its integrated modules, you can add new
computations and procedures written in external programming languages such as Python®, R and the .NET version of Microsoft® Visual Basic®.
This enables your organization to:
● Extend IBM SPSS Statistics functionality. Add computations and procedures written in other programming languages.
● Write generalized and more flexible jobs. Create generalized jobs by controlling logic based on the Variable Dictionary,
procedure output (XML or datasets), case data and environment. Reusable code means data is not tied to a single program.
● Handle errors with generated exceptions and easily determine the effectiveness of long syntax jobs
● Utilize hundreds of standard modules for Python
● React to results and metadata
● Build IBM SPSS Statistics functionality into other applications
Since the IBM SPSS Statistics Programmability Extension is already part of your IBM SPSS Statistics software, you can get started quickly.
Just use an IBM SPSS Statistics Programmability Integration Plug-In to take advantage of this advanced programmability functionality.
Getting started
Use the freeware plug-ins that SPSS Inc. has already built for Python, R, and .NET, or follow the instructions
in the IBM SPSS Statistics Programmability Extension Software Developer's Kit (SDK) to build your own. You can
download freeware plug-ins from the SPSS Developer Central Web site. New Programmability Integration Plug-Ins are being
developed by SPSS, an IBM Company, and will be available to download at SPSS Developer Central as soon as they are ready.
SPSS Developer Central
SPSS Developer Central is the online resource for end users and software developers interested in SPSS-related programming and development. From this Web site,
you can download programmability extensions and sample code, access forums and participate in discussions on programmability practices, and read in-depth articles
on SPSS programmability topics.
At SPSS Developer Central, you'll also find many example libraries and syntax jobs for use with the IBM SPSS Statistics-Python
Integration Plug-In. Current Python examples include:
● Functions for simplifying the calls to the IBM SPSS Statistics backend processor for common tasks
● Functions for working with the IBM SPSS Statistics Viewer
● Bootstrap regression
We encourage you to visit SPSS Developer Central regularly, as new information is frequently added.
IBM® SPSS® Direct Marketing helps you understand your customers in greater depth, improve your marketing campaigns and maximize the ROI of your marketing budget.
Conduct sophisticated analyses of your customers or contacts easily - and with a high level of confidence in your results. Choose from recency, frequency and monetary value (RFM) analysis,
cluster analysis, prospect profiling, postal code analysis, propensity scoring and control package testing. The software's intuitive interface enables you to:
● Identify which customers are likely to respond to specific promotional offers
● Develop a marketing strategy for each customer group
● Compare the effectiveness of direct mail campaigns
● Boost profits and reduce costs by mailing only to those customers most likely to respond
● Prevent spam complaints by monitoring the frequency of e-mails sent to each customer group
● Select potential business locations
● Connect to Salesforce.com to extract customer information, collect details on opportunities and perform analyses
Features & Benefits
Although IBM SPSS Direct Marketing relies on powerful analytics, you don't need to be a statistician or programmer to use it. The intuitive interface guides
you every step of the way, and the new Scoring Wizard makes it easy to build models to score your data. After you run an analysis, the significance of the output is clearly explained.
IBM SPSS Direct Marketing includes a combination of specifically chosen procedures that enable database and direct marketers to conduct data preparation and analysis activities.
You can do this using only IBM SPSS Direct Marketing, or you can use it in conjunction with other applications in the IBM SPSS Statistics product family.
● RFM Analysis: Score customers according to the recency, frequency and monetary value of their purchases.
● Segment customers or contacts: Create "clusters" of those who are like each other, and distinctly different from others.
● Profile customers or contacts: Identify shared characteristics, to improve the targeting of marketing offers and campaigns.
● Identify those who are likely to purchase: Develop propensity scores and improve the focus and timing of your campaigns.
● Test control packages: Find out which new (test) packages out-perform your existing (control) package.
● Know where responses come from: Identify by postal code the responses to your campaigns.
● Integrate response data with Salesforce.com to track leads and report on sales pipeline.
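The RFM scoring mentioned in the list above can be sketched in Python. This is a simplified rank-based binning written for illustration; the binning rules used by IBM SPSS Direct Marketing may differ, and the function name `rank_score` and the five example customers are hypothetical.

```python
def rank_score(values, bins=5, reverse=False):
    """Score each value from 1 to `bins` by rank; the highest score is the best.

    Set reverse=True when smaller values are better (e.g. days since last purchase).
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=reverse)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        # Map ranks 0..n-1 evenly onto scores 1..bins.
        scores[i] = rank * bins // len(values) + 1
    return scores


# Hypothetical customers: days since last purchase, number of orders, total spend.
days_since = [5, 40, 200, 15, 90]
orders = [12, 3, 1, 2, 8]
spend = [900.0, 150.0, 40.0, 600.0, 120.0]

recency = rank_score(days_since, reverse=True)  # fewer days -> higher score
frequency = rank_score(orders)
monetary = rank_score(spend)

# Combine the three scores into one RFM code per customer.
rfm = [f"{r}{f}{m}" for r, f, m in zip(recency, frequency, monetary)]
print(rfm)
```

Customers with codes near "555" bought recently, often and in high amounts, so they are natural candidates for targeted offers.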