When the MuleSoft MongoDB connector's Execute command operation runs a MongoDB aggregate command whose pipeline is empty ([]), the full result set is not returned.
You may see inconsistent result counts when retrieving large datasets with the ExecuteCommand operation and commands similar to the following:
| Command | Result |
| output application/json --- { aggregate: 'myCollection', pipeline: [], cursor: { batchSize: 1000000 } } | n results |
| output application/json --- { aggregate: 'myCollection', pipeline: [ { "$project": { "awesomeColumn": { "_id": 1 } } } ], cursor: { batchSize: 1000000 } } | y results (where y < n) |
MongoDB imposes several limits on large datasets, and these limits explain the behavior seen through the MuleSoft connector.
Result Size restrictions: Each document in the result set is subject to the 16 megabyte BSON Document Size limit.
Number of Stages restrictions: MongoDB 5.0 limits the number of aggregation pipeline stages allowed in a single pipeline to 1000.
Memory restrictions: Each individual pipeline stage has a limit of 100 megabytes of RAM.
Some of these restrictions can be configured on the MongoDB instance (that is, on the server side, outside the scope of the connector). See the MongoDB Aggregation Pipeline Limits article for more information.
When any of these limits is reached:
MongoDB automatically paginates the results to avoid performance issues.
The database ignores the requested batchSize if needed.
The number of results per batch depends on how much data has to be retrieved for each document (more data => fewer results per batch).
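The "more data => fewer results per batch" relationship can be illustrated with a rough estimate. This is only a sketch: it assumes that a single batch reply is capped near the 16 MiB BSON document size limit and it ignores per-document and reply overhead. The function name and the constant are hypothetical and are not part of MongoDB or the connector.

```python
# Rough estimate only. Assumption: one batch reply cannot grow beyond
# roughly the 16 MiB BSON document size limit; overhead is ignored.
MAX_REPLY_BYTES = 16 * 1024 * 1024  # assumed cap for a single batch reply


def estimated_docs_per_batch(avg_doc_bytes: int, batch_size: int) -> int:
    """Estimate how many documents fit into one batch.

    The result is the requested batch size, unless the assumed reply
    cap allows fewer documents of the given average size.
    """
    if avg_doc_bytes <= 0:
        raise ValueError("avg_doc_bytes must be positive")
    return min(batch_size, MAX_REPLY_BYTES // avg_doc_bytes)
```

Under these assumptions, a batchSize of 1000000 is never honored in full: the larger the average document, the smaller the batch that actually comes back.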
In MuleSoft, if you see these inconsistencies when retrieving large datasets with the ExecuteCommand operation, you need to implement pagination on the user side:
The ExecuteCommand operation returns a cursor as a result that will allow the user to retrieve more information from the database. A common paginated result can look something like this:
{
  "cursor": {
    "firstBatch": [
      {
        /... Data .../
      },
      ...
    ],
    "id": 5831430585665131451,
    "ns": "myDatabase.myCollection"
  },
  "ok": 1.0
}
When cursor.id is not 0, there are more results. The user can retrieve them with the getMore command, like this:
{
  "getMore": 5831430585665131451, <= cursor.id received as result from the first request
  "collection": "myCollection"
}
The result will look like this:
{
  "cursor": {
    "nextBatch": [
      {
        /... Data .../
      },
      ...
    ],
    "id": 0,
    "ns": "myDatabase.myCollection"
  },
  "ok": 1.0
}
The returned cursor.id is either 0 or the same id that was sent in the request. The same id is returned on every request until there are no more batches to retrieve, at which point it is 0.
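The pagination described above can be sketched as a loop that sends the aggregate command once and then repeats getMore until cursor.id is 0. This is a minimal illustration in Python, not the connector's implementation: run_command is a hypothetical callable that stands in for whatever executes a command document and returns the reply (for example, the ExecuteCommand operation or a driver call), and iterate_aggregate is a name chosen for this example.

```python
from typing import Any, Callable, Dict, Iterator, List

Command = Dict[str, Any]
Reply = Dict[str, Any]


def iterate_aggregate(
    run_command: Callable[[Command], Reply],  # hypothetical command executor
    collection: str,
    pipeline: List[Command],
    batch_size: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield every document of an aggregation by following its cursor."""
    reply = run_command({
        "aggregate": collection,
        "pipeline": pipeline,
        "cursor": {"batchSize": batch_size},
    })
    cursor = reply["cursor"]
    yield from cursor["firstBatch"]

    # A cursor id of 0 means there are no more batches to retrieve.
    while cursor["id"] != 0:
        reply = run_command({
            "getMore": cursor["id"],
            "collection": collection,
        })
        cursor = reply["cursor"]
        yield from cursor["nextBatch"]
```

A caller would collect the results with something like list(iterate_aggregate(run_command, "myCollection", [])). Because the function is a generator, batches are requested only as the documents are consumed, so the whole result set does not have to be held in memory at once.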
001122165
