
MuleSoft DataWeave: Identifying Duplicates & Non-duplicates in an Array

Imagine you have a list of numbers. This DataWeave script splits those numbers into two groups: duplicates and non-duplicates. It’s like organizing a collection of balls – grouping identical ones together and separating the unique ones.

Getting Started: The Plan

%dw 2.0
output application/json
import * from dw::core::Arrays

This header sets the stage:

  • %dw 2.0: We’re using DataWeave version 2.0.
  • output application/json: The output will be in JSON format for easy readability.
  • import * from dw::core::Arrays: Brings the helper functions of the dw::core::Arrays module into scope.

Our Data: The Numbers

var inputData = [3, 1, 5, 4, 2, 1, 5, 6, 7, 1]

This is our input array, representing the numbers we want to analyze.

The DataWeave Magic: Splitting the Numbers

inputData
    groupBy ($)
    pluck {
        "value": $$,
        "count": sizeOf($)
    }
    groupBy ($.count > 1)
    mapObject (
        if ($$ ~= true)
            {"duplicate": $}
        else
            {"non-duplicate": $}
    )

This is the core of the script:

  1. inputData groupBy ($): Groups the input data by value, producing one group per distinct number.
  2. pluck {...}: Turns each group into an object containing the “value” and its “count”, yielding an array of these objects.
  3. groupBy ( $.count > 1 ): Splits that array into two groups: entries whose value appears more than once (count greater than 1) and entries whose value is unique (count of exactly 1).
  4. mapObject(...): Builds the final output with two keys: “duplicate” for the repeated values and “non-duplicate” for the unique ones. (The shape of the data after each step is sketched below.)
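
To make each transformation concrete, here is roughly what the data looks like after every step for the sample input above. These shapes are worked out by hand rather than captured from a run, so key order may differ slightly in your environment.

After step 1 (groupBy ($)):
{ "3": [3], "1": [1, 1, 1], "5": [5, 5], "4": [4], "2": [2], "6": [6], "7": [7] }

After step 2 (pluck {"value": $$, "count": sizeOf($)}):
[ { "value": "3", "count": 1 }, { "value": "1", "count": 3 }, { "value": "5", "count": 2 }, { "value": "4", "count": 1 }, { "value": "2", "count": 1 }, { "value": "6", "count": 1 }, { "value": "7", "count": 1 } ]

After step 3 (groupBy ( $.count > 1 )):
{ "false": [ ...the five count-1 objects... ], "true": [ { "value": "1", "count": 3 }, { "value": "5", "count": 2 } ] }

Step 4 then renames the "true" group to "duplicate" and the "false" group to "non-duplicate".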

The Result: A Well-Organized List

{"duplicate": [   {"value": "3","count": 2   },   {"value": "4","count": 2   },   {"value": "5","count": 2   } ],"non-duplicate": [   {"value": "1","count": 1   },   {"value": "2","count": 1   },   {"value": "6","count": 1   } ]}

The script cleanly separates the duplicate values in the input array from the non-duplicate ones, along with how often each occurs, giving a clear and concise summary of the data.
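
As a variation, since the script already imports dw::core::Arrays, the last two steps could instead use that module’s partition function, which splits an array into the elements that satisfy a condition and those that do not. The following is only a sketch of that alternative, not the approach used above:

%dw 2.0
output application/json
import partition from dw::core::Arrays

var inputData = [3, 1, 5, 4, 2, 1, 5, 6, 7, 1]
// Same value/count pairs as step 2 of the walkthrough
var counts = inputData groupBy ($) pluck {"value": $$, "count": sizeOf($)}
// partition returns { success: [...], failure: [...] }
var split = counts partition ($.count > 1)
---
{
    "duplicate": split.success,
    "non-duplicate": split.failure
}

This produces the same two buckets without the second groupBy/mapObject pass.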
