MongoDB $split Operator

The word split refers to dividing one string into two or more substrings. The $split operator of MongoDB creates substrings from the string value of a particular field in any collection of a MongoDB database. Just as a subset belongs to a particular set in mathematics, the substrings created by the $split operator belong to one specific field value. In this MongoDB guide, we discuss how to use the $split operator to split the value of a specific field into two or more substrings using a delimiter.

Example 01:

To get started, we need a collection to work with, since MongoDB stores its documents in collections. You should have MongoDB Compass and the MongoDB shell configured at your end. First, you need a collection in your database to which the $split operator will be applied. Therefore, we create a collection named “Data” with the “createCollection” function.

test> db.createCollection("Data")

{ ok: 1 }

We start the first illustration by inserting records into the “Data” collection. MongoDB’s insertMany() function is used here to insert 5 records into the collection. Each of these records has 3 fields: id, Name, and AdmDate. The output acknowledgment shows that the records were inserted successfully into the Data collection.

test> db.Data.insertMany([ {id: 1, Name: "Joe", AdmDate: "Dec-22-2020"},

... {id: 2, Name: "Peter", AdmDate: "Nov-14-2021"},

... {id: 3, Name: "Nina", AdmDate: "Nov-14-2018"},

... {id: 4, Name: "Misha", AdmDate: "Jan-14-2022"},

... {id: 5, Name: "Elen", AdmDate: "Sep-4-2021"} ])

{ acknowledged: true,

insertedIds: {

'0': ObjectId("63bd2d8e01632fd3c02ab8d3"),

'1': ObjectId("63bd2d8e01632fd3c02ab8d4"),

'2': ObjectId("63bd2d8e01632fd3c02ab8d5"),

'3': ObjectId("63bd2d8e01632fd3c02ab8d6"),

'4': ObjectId("63bd2d8e01632fd3c02ab8d7")

} }

Now that the records are stored in the Data collection as documents, we can display them in the MongoDB console in a JSON-like format. The find() function helps here: calling find() and passing “printjson” to forEach() displays the records, as the output demonstrates.

test> db.Data.find().forEach(printjson)

{ _id: ObjectId("63bd2d8e01632fd3c02ab8d3"), id: 1, Name: 'Joe', AdmDate: 'Dec-22-2020' }

{ _id: ObjectId("63bd2d8e01632fd3c02ab8d4"), id: 2, Name: 'Peter', AdmDate: 'Nov-14-2021' }

{ _id: ObjectId("63bd2d8e01632fd3c02ab8d5"), id: 3, Name: 'Nina', AdmDate: 'Nov-14-2018' }

{ _id: ObjectId("63bd2d8e01632fd3c02ab8d6"), id: 4, Name: 'Misha', AdmDate: 'Jan-14-2022' }

{ _id: ObjectId("63bd2d8e01632fd3c02ab8d7"), id: 5, Name: 'Elen', AdmDate: 'Sep-4-2021' }

In this illustration, the “AdmDate” field contains the “-” character between the month, day, and year. We will use the “-” character to split the AdmDate field into substrings. To use the $split operator, we call the aggregate function of MongoDB on the “Data” collection. The pipeline starts with the “$match” stage, which is used here to select a record by one of its fields: id: 2 selects the second record.

After this, we use the $project stage to apply the $split operator to the “AdmDate” field, splitting its value into 3 substrings with the “-” character as the delimiter. The “Name” field is displayed as it is, while the result of splitting AdmDate is shown under a new title, “Date”. The output shows the Name field of the second record unchanged. In place of AdmDate, the “Date” field holds the value split into three substrings at the “-” delimiter and displayed as an array.

test> db.Data.aggregate([ {$match: {id: 2}}, {$project: {Name: 1, Date: {$split: ["$AdmDate", "-"]}}} ])

[ { _id: ObjectId("63bd2d8e01632fd3c02ab8d4"), Name: 'Peter', Date: [ 'Nov', '14', '2021' ] } ]
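To make the behavior of the operator concrete, here is a minimal sketch in plain JavaScript that mimics what $split computes for simple cases. The helper name splitLike is hypothetical (it is not a MongoDB API), and the sketch only models string inputs, a missing or null value, and a delimiter that does not occur in the string.

```javascript
// Hypothetical helper, not part of MongoDB: approximates $split for simple cases.
function splitLike(value, delimiter) {
  // $split yields null when the input is null or the field is missing.
  if (value === null || value === undefined) return null;
  // Assumption: this sketch treats an empty delimiter as invalid and does not model it.
  if (typeof delimiter !== "string" || delimiter === "") {
    throw new TypeError("delimiter must be a non-empty string");
  }
  // For a non-empty string delimiter, String.prototype.split behaves the same way:
  // if the delimiter is absent, the whole string comes back as a single element.
  return value.split(delimiter);
}

console.log(splitLike("Nov-14-2021", "-")); // [ 'Nov', '14', '2021' ]
console.log(splitLike("Nov-14-2021", "/")); // [ 'Nov-14-2021' ]
console.log(splitLike(undefined, "-"));     // null
```

The second call shows why it is worth checking the delimiter: when it does not occur in the value, the result is still an array, just with the original string as its only element.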

The $split operator only changes the result returned at run time; it does not affect the actual record stored in the collection. This is illustrated by the find() instruction and its output for the Data collection in the following snippet. The “AdmDate” field is the same as it was before the $split operator was used.

test> db.Data.find({id:2})

[ { _id: ObjectId("63bd2d8e01632fd3c02ab8d4"), id: 2, Name: 'Peter', AdmDate: 'Nov-14-2021' } ]

Example 02:

In the example above, we saw how the $split operator can split a field value into 2 or more substrings without updating the original records. To add the new field (containing the substrings) to the collection itself, we use the $merge stage along with the $split operator. Make sure to keep $merge (which targets the “Data” collection) as a separate pipeline stage after the $project stage that applies $split to the AdmDate field. By default, $merge matches documents on _id and merges the new field into the existing document, so the other fields are kept. This query returns nothing, as it updates the original records of the “Data” collection without displaying anything in the MongoDB shell.

test> db.Data.aggregate([ {$project: { DateInfo: {$split: ["$AdmDate", "-"]}}},

... { $merge: "Data" } ])

After using the $merge stage in the instruction above to split the AdmDate field, find() shows the output below.

test> db.Data.find({})

[{_id: ObjectId("63bd2d8e01632fd3c02ab8d3"), id: 1, Name: 'Joe', AdmDate: 'Dec-22-2020', DateInfo: [ 'Dec', '22', '2020' ] },

{_id: ObjectId("63bd2d8e01632fd3c02ab8d4"), id: 2, Name: 'Peter', AdmDate: 'Nov-14-2021', DateInfo: [ 'Nov', '14', '2021' ] },

{_id: ObjectId("63bd2d8e01632fd3c02ab8d5"), id: 3, Name: 'Nina', AdmDate: 'Nov-14-2018', DateInfo: [ 'Nov', '14', '2018' ] },

{_id: ObjectId("63bd2d8e01632fd3c02ab8d6"), id: 4, Name: 'Misha', AdmDate: 'Jan-14-2022', DateInfo: [ 'Jan', '14', '2022' ] },

{_id: ObjectId("63bd2d8e01632fd3c02ab8d7"), id: 5, Name: 'Elen', AdmDate: 'Sep-4-2021', DateInfo: [ 'Sep', '4', '2021' ] }]
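The way the new DateInfo field ends up next to the existing fields can be sketched in plain JavaScript. This is only an illustration of the default match-on-_id, merge-fields behavior, not how MongoDB implements it; the shortened _id strings and the variable names stored, projected, and merged are assumptions made for the example.

```javascript
// Two of the five stored documents; _id values are shortened placeholders.
const stored = [
  { _id: "d3", id: 1, Name: "Joe", AdmDate: "Dec-22-2020" },
  { _id: "d4", id: 2, Name: "Peter", AdmDate: "Nov-14-2021" },
];

// What the $project stage emits: only _id plus the new DateInfo array.
const projected = stored.map((doc) => ({
  _id: doc._id,
  DateInfo: doc.AdmDate.split("-"),
}));

// Sketch of the default $merge behavior: find the stored document with the
// same _id and copy the projected fields into it, keeping everything else.
const byId = new Map(projected.map((doc) => [doc._id, doc]));
const merged = stored.map((doc) => ({ ...doc, ...byId.get(doc._id) }));

console.log(merged);
```

Each merged document keeps id, Name, and AdmDate and gains DateInfo, which matches the shape of the find() output above.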

Example 03:

Let us create a new collection named “Person” in the same database for our next illustration. The createCollection() function is called with the name of the collection as its parameter.

test> db.createCollection("Person")

{ ok: 1 }

Now that the collection is created and empty, we have to add documents to it. As we are adding more than 1 record to the “Person” collection, we use the insertMany() function. The 5 records to be added each contain 2 fields: id and Name. We will apply the $split operator to the Name field to split its string into substrings.

test> db.Person.insertMany([{id:1, Name: "Lia Asif"}, {id:2, Name: "Joly Woe"}, {id: 3, Name: "Eden Robe"}, {id:4, Name: "William Robert Patinson"}, {id:5, Name: "Justin P Trudo"}])

{ acknowledged: true,

insertedIds: {

'0': ObjectId("63bd33e401632fd3c02ab8e1"),

'1': ObjectId("63bd33e401632fd3c02ab8e2"),

'2': ObjectId("63bd33e401632fd3c02ab8e3"),

'3': ObjectId("63bd33e401632fd3c02ab8e4"),

'4': ObjectId("63bd33e401632fd3c02ab8e5")

} }

After these 5 records are added to the “Person” collection via insertMany(), we can display them with a single “find” query, as shown beneath. The output shows the 5 records of the Person collection with unique ids.

test> db.Person.find({})

[ { _id: ObjectId("63bd33e401632fd3c02ab8e1"), id: 1, Name: 'Lia Asif' },

{ _id: ObjectId("63bd33e401632fd3c02ab8e2"), id: 2, Name: 'Joly Woe' },

{ _id: ObjectId("63bd33e401632fd3c02ab8e3"), id: 3, Name: 'Eden Robe' },

{ _id: ObjectId("63bd33e401632fd3c02ab8e4"), id: 4, Name: 'William Robert Patinson' },

{ _id: ObjectId("63bd33e401632fd3c02ab8e5"), id: 5, Name: 'Justin P Trudo' } ]

As mentioned before, we will use the Name field to explore the $split operator further. Here comes the aggregate function once again, with the $project stage in it. This time, we split the Name field at the single space between the parts of each name. The result is displayed under the title “Title” instead of “Name”. Running this query in the MongoDB shell shows the split values of the Name field in an array “Title” for all records, i.e. each name has been split into 2 or more substrings.

test> db.Person.aggregate([{ $project: { Title: { $split: ["$Name", " "] } } }])

[ { _id: ObjectId("63bd33e401632fd3c02ab8e1"), Title: [ 'Lia', 'Asif' ] },

{ _id: ObjectId("63bd33e401632fd3c02ab8e2"), Title: [ 'Joly', 'Woe' ] },

{ _id: ObjectId("63bd33e401632fd3c02ab8e3"), Title: [ 'Eden', 'Robe' ] },

{ _id: ObjectId("63bd33e401632fd3c02ab8e4"), Title: [ 'William', 'Robert', 'Patinson' ] },

{ _id: ObjectId("63bd33e401632fd3c02ab8e5"), Title: [ 'Justin', 'P', 'Trudo' ] } ]
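Since the names above split into arrays of different lengths, a natural follow-up is to pick individual elements out of the array. The sketch below builds such a pipeline as a plain JavaScript value using the $arrayElemAt operator; the output field names First and Last and the variable names nameParts and pipeline are illustrative assumptions, not part of the original data.

```javascript
// Reusable expression: the array produced by splitting Name at a single space.
const nameParts = { $split: ["$Name", " "] };

// Sketch of a pipeline that keeps only the first and last substrings.
const pipeline = [
  {
    $project: {
      First: { $arrayElemAt: [nameParts, 0] }, // index 0 is the first substring
      Last: { $arrayElemAt: [nameParts, -1] }, // a negative index counts from the end
    },
  },
];

// In mongosh, this could be run as: db.Person.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Using -1 for the last element avoids having to know whether a name has two or three parts.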

Conclusion

This MongoDB guide discussed the usage of the $split operator in MongoDB, comparing it with the concept of subsets in mathematics. To support the explanation, we walked through three examples in the form of code snippets. These examples illustrate how the $split operator can split the values of a particular field into substrings at run time and, combined with $merge, as a permanent change in the collection.

About the author

Saeed Raza