Find Duplicate Records in IBM SPSS Modeler

To improve your experience using IBM SPSS Modeler, the Version 1 SPSS experts have created various Tech Tips. This Tech Tip shows how to find duplicate records in IBM SPSS Modeler.

IBM SPSS Modeler is a predictive analytics platform designed to bring predictive intelligence to decisions made by individuals, groups, systems, and the enterprise. Modeler has an easy-to-use drag-and-drop user interface with a complete set of tools for data access, examination, preparation, modelling, evaluation, and deployment.

IBM SPSS Modeler users have a complete toolset to build predictive models from start to finish. Modeler uses node-based, visual programming. Users pick nodes from palettes and place them on the stream canvas. Once nodes have been placed on the stream canvas and edited, they can be linked to form a stream. A stream represents a flow of data through several operations (nodes) to a destination that can be in the form of output (either text or chart), a model, or the export of data to another format (e.g., a database).

Identifying duplicate records is a critical step in the data understanding, examination, and preparation phases of predictive modelling. To identify duplicates in IBM SPSS Modeler, go to the Record Ops palette, select the Distinct node, and drag it onto the stream canvas. Alternatively, double-click the node in the palette to drop it onto the canvas. Once the node is on the canvas, connect it to your stream, then double-click it to open its settings.

The Distinct node offers three modes: ‘Create a composite record for each group’, ‘Include only the first record in each group’, and ‘Discard only the first record in each group’. With ‘Create a composite record for each group’, you can aggregate non-numeric fields into a single record per group. With ‘Include only the first record in each group’, the first record in each group of duplicates is kept and the rest are discarded.
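To make the idea of a composite record concrete, the following is a minimal Python sketch and not Modeler code. It assumes that aggregating a non-numeric field means joining the distinct values found in each group with a delimiter, which is only one possible aggregation. The function name `composite_records`, the field names, and the sample data are hypothetical.

```python
def composite_records(records, key_fields, delimiter=","):
    """Collapse each group of records sharing the same key values into one record.

    Assumption for illustration: non-key fields are aggregated by joining their
    distinct values (as strings) in order of first appearance.
    """
    groups = {}
    for rec in records:
        group_key = tuple(rec[f] for f in key_fields)
        groups.setdefault(group_key, []).append(rec)

    composites = []
    for group in groups.values():
        merged = {}
        for field in group[0]:
            if field in key_fields:
                # Key fields are identical within a group, so keep the value as-is.
                merged[field] = group[0][field]
                continue
            seen = []
            for rec in group:
                value = str(rec[field])
                if value not in seen:
                    seen.append(value)
            merged[field] = delimiter.join(seen)
        composites.append(merged)
    return composites


# Hypothetical sample data: customer C1 appears twice with different emails.
customers = [
    {"customer_id": "C1", "email": "ann@example.com"},
    {"customer_id": "C2", "email": "bob@example.com"},
    {"customer_id": "C1", "email": "ann.smith@example.com"},
]

for row in composite_records(customers, ["customer_id"]):
    print(row)
```

In this sketch the two C1 rows collapse into one record whose `email` field holds both addresses, while C2 passes through unchanged.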

With ‘Discard only the first record in each group’, the first record in each group of duplicates is discarded and the remaining records are passed on, which leaves you with only the duplicates. The ‘Key fields for grouping’ option lists the field or fields used to determine whether records are identical. The Distinct node also has a sort order option that lists the fields used to sort records within each group of duplicates, and whether each is sorted in ascending or descending order. Because the sort order decides which record comes first in a group, it also decides which record is kept or discarded.
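The interaction between key fields, sort order, and the include/discard modes can be sketched in plain Python. This is an illustration of the logic described above, not Modeler's implementation: the function `distinct`, its parameters, and the sample records are hypothetical, and the order of the output rows here is not a claim about the order Modeler produces.

```python
def distinct(records, key_fields, mode="include", sort_field=None, descending=False):
    """Emulate the include-first / discard-first behaviour on a list of dicts.

    mode="include": keep only the first record of each group.
    mode="discard": drop the first record of each group and keep the rest.
    """
    groups = {}
    for rec in records:
        group_key = tuple(rec[f] for f in key_fields)
        groups.setdefault(group_key, []).append(rec)

    result = []
    for group in groups.values():
        if sort_field is not None:
            # The sort within each group determines which record counts as "first".
            group = sorted(group, key=lambda r: r[sort_field], reverse=descending)
        if mode == "include":
            result.append(group[0])
        elif mode == "discard":
            result.extend(group[1:])
        else:
            raise ValueError(f"unknown mode: {mode!r}")
    return result


# Hypothetical sample data: id 1 appears three times with different dates.
orders = [
    {"id": 1, "name": "Ann", "date": "2024-01-05"},
    {"id": 2, "name": "Bob", "date": "2024-01-02"},
    {"id": 1, "name": "Ann", "date": "2024-03-01"},
    {"id": 1, "name": "Ann", "date": "2024-02-10"},
]

# Keep the most recent record per id.
latest = distinct(orders, ["id"], mode="include", sort_field="date", descending=True)

# Keep only the extra (duplicate) records per id.
duplicates = distinct(orders, ["id"], mode="discard", sort_field="date", descending=True)

print(latest)
print(duplicates)
```

With a descending sort on `date`, the include mode keeps the latest record for each `id`, and the discard mode returns the two older records for `id` 1 and nothing for `id` 2, since it has no duplicates.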


