Aggregation in MongoDB

Aggregation Operations

Aggregation operations process multiple documents and return computed results. You can use aggregation operations to:

  • Group values from multiple documents together.
  • Perform operations on the grouped data to return a single result.
  • Analyze data changes over time.

Aggregation Pipelines

An aggregation pipeline consists of one or more stages that process documents:

  • Each stage performs an operation on the input documents. For example, a stage can filter documents, group documents, and calculate values.
  • The documents that are output from a stage are passed to the next stage.
  • An aggregation pipeline can return results for groups of documents. For example, return the total, average, maximum, and minimum values.
The examples below use an orders collection created with the following documents:

db.orders.insertMany( [
   { _id: 0, name: "Pepperoni", size: "small", price: 19,
     quantity: 10, date: ISODate( "2021-03-13T08:14:30Z" ) },
   { _id: 1, name: "Pepperoni", size: "medium", price: 20,
     quantity: 20, date: ISODate( "2021-03-13T09:13:24Z" ) },
   { _id: 2, name: "Pepperoni", size: "large", price: 21,
     quantity: 30, date: ISODate( "2021-03-17T09:22:12Z" ) },
   { _id: 3, name: "Cheese", size: "small", price: 12,
     quantity: 15, date: ISODate( "2021-03-13T11:21:39.736Z" ) },
   { _id: 4, name: "Cheese", size: "medium", price: 13,
     quantity: 50, date: ISODate( "2022-01-12T21:23:13.331Z" ) },
   { _id: 5, name: "Cheese", size: "large", price: 14,
     quantity: 10, date: ISODate( "2022-01-12T05:08:13Z" ) },
   { _id: 6, name: "Vegan", size: "small", price: 17,
     quantity: 10, date: ISODate( "2021-01-13T05:08:13Z" ) },
   { _id: 7, name: "Vegan", size: "medium", price: 18,
     quantity: 10, date: ISODate( "2021-01-13T05:10:13Z" ) }
] )

Calculate Total Order Quantity

The following two-stage pipeline returns the total order quantity of medium-size pizzas, grouped by pizza name:

db.orders.aggregate( [

   // Stage 1: Filter pizza order documents by pizza size
   {
      $match: { size: "medium" }
   },

   // Stage 2: Group remaining documents by pizza name and calculate total quantity
   {
      $group: { _id: "$name", totalQuantity: { $sum: "$quantity" } }
   }

] )

Example output:

[
   { _id: 'Cheese', totalQuantity: 50 },
   { _id: 'Vegan', totalQuantity: 10 },
   { _id: 'Pepperoni', totalQuantity: 20 }
]
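
The medium-size orders in the sample data are _id 1 (Pepperoni, quantity 20), _id 4 (Cheese, quantity 50), and _id 7 (Vegan, quantity 10), which matches the totals above.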

Calculate Total Order Value and Average Order Quantity

The following three-stage pipeline calculates the total order value and the average order quantity for each order date within a date range:

db.orders.aggregate( [

   // Stage 1: Filter pizza order documents by date range
   {
      $match:
      {
         "date": { $gte: new ISODate( "2020-01-30" ), $lt: new ISODate( "2022-01-30" ) }
      }
   },

   // Stage 2: Group remaining documents by date and calculate results
   {
      $group:
      {
         _id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
         totalOrderValue: { $sum: { $multiply: [ "$price", "$quantity" ] } },
         averageOrderQuantity: { $avg: "$quantity" }
      }
   },

   // Stage 3: Sort documents by totalOrderValue in descending order
   {
      $sort: { totalOrderValue: -1 }
   }

 ] )

Example output:

[
   { _id: '2022-01-12', totalOrderValue: 790, averageOrderQuantity: 30 },
   { _id: '2021-03-13', totalOrderValue: 770, averageOrderQuantity: 15 },
   { _id: '2021-03-17', totalOrderValue: 630, averageOrderQuantity: 30 },
   { _id: '2021-01-13', totalOrderValue: 350, averageOrderQuantity: 10 }
]
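
As a check against the sample data, the 2022-01-12 group contains the two Cheese orders placed on that date, so totalOrderValue is 13 × 50 + 14 × 10 = 790 and averageOrderQuantity is (50 + 10) / 2 = 30.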

References
https://www.mongodb.com/docs/manual/aggregation/
https://www.mongodb.com/docs/manual/core/aggregation-pipeline/

Selecting MongoDB Fields Using Projection Queries

Queries in MongoDB return all fields in matching documents. To limit the amount of data that MongoDB sends to applications, you can include a projection document to specify or restrict fields to return.

db.inventory.insertMany( [
  { item: "journal", status: "A", size: { h: 14, w: 21, uom: "cm" }, instock: [ { warehouse: "A", qty: 5 } ] },
  { item: "notebook", status: "A",  size: { h: 8.5, w: 11, uom: "in" }, instock: [ { warehouse: "C", qty: 5 } ] },
  { item: "paper", status: "D", size: { h: 8.5, w: 11, uom: "in" }, instock: [ { warehouse: "A", qty: 60 } ] },
  { item: "planner", status: "D", size: { h: 22.85, w: 30, uom: "cm" }, instock: [ { warehouse: "A", qty: 40 } ] },
  { item: "postcard", status: "A", size: { h: 10, w: 15.25, uom: "cm" }, instock: [ { warehouse: "B", qty: 15 }, { warehouse: "C", qty: 35 } ] }
]);

Return All Fields in Matching Documents (Without Projection)

db.inventory.find( { status: "A" } )

Return the Specified Fields and the _id Field Only

db.inventory.find( { status: "A" }, { item: 1, status: 1 } )

Suppress _id Field

db.inventory.find( { status: "A" }, { item: 1, status: 1, _id: 0 } )
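
With the sample inventory documents above, this query returns only the projected fields:

[
  { item: 'journal', status: 'A' },
  { item: 'notebook', status: 'A' },
  { item: 'postcard', status: 'A' }
]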

Return All But the Excluded Fields

db.inventory.find( { status: "A" }, { status: 0, instock: 0 } )

Return Specific Fields in Embedded Documents

db.inventory.find(
   { status: "A" },
   { item: 1, status: 1, "size.uom": 1 }
)
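
With the sample documents, the embedded structure is preserved and only size.uom is returned from size (the generated _id values are shown here as placeholders):

[
  { _id: ObjectId("..."), item: 'journal', status: 'A', size: { uom: 'cm' } },
  { _id: ObjectId("..."), item: 'notebook', status: 'A', size: { uom: 'in' } },
  { _id: ObjectId("..."), item: 'postcard', status: 'A', size: { uom: 'cm' } }
]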

Suppress Specific Fields in Embedded Documents

db.inventory.find(
   { status: "A" },
   { "size.uom": 0 }
)

References
https://www.mongodb.com/docs/manual/tutorial/project-fields-from-query-results/

Log All Queries in MongoDB using Database Profiler

Enable query logging until the next restart

mongo -u username -p password
use myDb
db.setProfilingLevel(2)   // level 2 collects profiling data for all operations on the current database

Profiling Status

db.getProfilingStatus() returns the current profile level, the slowOpThresholdMs setting, and the slowOpSampleRate setting:

> db.getProfilingStatus()
{ "was" : 2, "slowms" : 1, "sampleRate" : 1 }

Starting in MongoDB 4.4.2, you can set a filter to control which operations are logged by the profiler.
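
For example, on MongoDB 4.4.2 or later the filter can also be set at runtime through db.setProfilingLevel (a sketch; adjust the operation type and threshold to your needs):

db.setProfilingLevel( 1, { filter: { op: "query", millis: { $gt: 2000 } } } )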

Enable query logging permanently

nano /etc/mongod.conf

Profile all operations and treat queries that take more than 1 millisecond to run as slow (restart mongod for configuration file changes to take effect):

operationProfiling:
   mode: all
   slowOpThresholdMs: 1
   slowOpSampleRate: 1.0

Log only queries that match a filter:

operationProfiling:
   mode: all
   filter: '{ op: "query", millis: { $gt: 2000 } }'

Change Size of system.profile Collection on the Primary

The <database>.system.profile collection stores database profiling information.

To change the size of the system.profile collection on the primary, you must:

  • Disable profiling.
  • Drop the system.profile collection.
  • Create a new system.profile collection.
  • Re-enable profiling.

For example, to create a new system.profile collection that is 4000000 bytes (4 MB), use the following sequence of operations in mongosh:

// 1. Disable profiling
db.setProfilingLevel(0)

// 2. Drop the existing system.profile collection
db.system.profile.drop()

// 3. Create a new capped system.profile collection of 4000000 bytes
db.createCollection( "system.profile", { capped: true, size: 4000000 } )

// 4. Re-enable profiling
db.setProfilingLevel(1)

References
https://stackoverflow.com/questions/15204341/mongodb-logging-all-queries
https://stackoverflow.com/questions/18510966/why-the-queries-inside-the-system-profile-collection-are-being-overwritten
https://www.mongodb.com/docs/manual/tutorial/manage-the-database-profiler/
https://www.mongodb.com/docs/manual/reference/configuration-options/

SimpleTrigger vs CronTrigger in Quartz Scheduler

SimpleTrigger
SimpleTrigger is used for scenarios in which we need to execute a job at a specific moment in time, optionally repeated at fixed intervals.

// myStartTime is a java.util.Date defined elsewhere
SimpleTrigger trigger = (SimpleTrigger) TriggerBuilder.newTrigger()
  .withIdentity("trigger1", "group1")
  .startAt(myStartTime)
  .forJob("job1", "group1")
  .build();

CronTrigger
The CronTrigger is used when we need schedules based on calendar-like statements.

CronTrigger trigger = TriggerBuilder.newTrigger()
  .withIdentity("trigger3", "group1")
  // fires every 2 minutes during the hours 8 through 17 (08:00 to 17:58), every day
  .withSchedule(CronScheduleBuilder.cronSchedule("0 0/2 8-17 * * ?"))
  .forJob("myJob", "group1")
  .build();

References
https://www.baeldung.com/quartz
https://stackoverflow.com/questions/26403391/difference-between-cron-trigger-and-simple-trigger-in-quartz-scheduler
http://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html
https://www.freeformatter.com/cron-expression-generator-quartz.html

Volatile memory in Java

To ensure that updates to variables propagate predictably to other threads, we should apply the volatile modifier to those variables:

public class TaskRunner {

    private volatile static int number;
    private volatile static boolean ready;

    // rest of the class omitted
}

This way, we tell the runtime and the processor not to reorder any instruction involving the volatile variables. The processor also flushes any updates to these variables to main memory right away, so they become visible to other threads.

References
https://www.baeldung.com/java-volatile
https://www.baeldung.com/java-stack-heap

Using Fetch API in JavaScript

GET request using JavaScript Promises

fetch('http://example.com/movies.json')
  .then(response => response.json())
  .then(data => console.log(data));

Supplying request options

// Example POST method implementation:
async function postData(url = '', data = {}) {
  // Default options are marked with *
  const response = await fetch(url, {
    method: 'POST', // *GET, POST, PUT, DELETE, etc.
    mode: 'cors', // no-cors, *cors, same-origin
    cache: 'no-cache', // *default, no-cache, reload, force-cache, only-if-cached
    credentials: 'same-origin', // include, *same-origin, omit
    headers: {
      'Content-Type': 'application/json'
      // 'Content-Type': 'application/x-www-form-urlencoded',
    },
    redirect: 'follow', // manual, *follow, error
    referrerPolicy: 'no-referrer', // no-referrer, *no-referrer-when-downgrade, origin, origin-when-cross-origin, same-origin, strict-origin, strict-origin-when-cross-origin, unsafe-url
    body: JSON.stringify(data) // body data type must match "Content-Type" header
  });
  return response.json(); // parses JSON response into native JavaScript objects
}

postData('https://example.com/answer', { answer: 42 })
  .then(data => {
    console.log(data); // JSON data parsed by the `response.json()` call
  });

Checking that the fetch was successful

fetch('flowers.jpg')
  .then(response => {
    if (!response.ok) {
      throw new Error('Network response was not OK');
    }
    return response.blob();
  })
  .then(myBlob => {
    myImage.src = URL.createObjectURL(myBlob);
  })
  .catch(error => {
    console.error('There has been a problem with your fetch operation:', error);
  });

Supplying your own request object

const myHeaders = new Headers();

const myRequest = new Request('flowers.jpg', {
  method: 'GET',
  headers: myHeaders,
  mode: 'cors',
  cache: 'default',
});

fetch(myRequest)
  .then(response => response.blob())
  .then(myBlob => {
    myImage.src = URL.createObjectURL(myBlob);
  });

Headers

const content = 'Hello World';

// Build headers by appending values one at a time...
const myHeaders = new Headers();
myHeaders.append('Content-Type', 'text/plain');
myHeaders.append('Content-Length', content.length.toString());
myHeaders.append('X-Custom-Header', 'ProcessThisImmediately');

// ...or pass an object literal to the constructor
const sameHeaders = new Headers({
  'Content-Type': 'text/plain',
  'Content-Length': content.length.toString(),
  'X-Custom-Header': 'ProcessThisImmediately'
});

console.log(myHeaders.has('Content-Type')); // true
console.log(myHeaders.has('Set-Cookie')); // false
myHeaders.set('Content-Type', 'text/html');
myHeaders.append('X-Custom-Header', 'AnotherValue');

console.log(myHeaders.get('Content-Length')); // "11"
console.log(myHeaders.get('X-Custom-Header')); // "ProcessThisImmediately, AnotherValue"

myHeaders.delete('X-Custom-Header');
console.log(myHeaders.get('X-Custom-Header')); // null

// Use the response headers to check the content type before parsing
fetch(myRequest)
  .then(response => {
     const contentType = response.headers.get('content-type');
     if (!contentType || !contentType.includes('application/json')) {
       throw new TypeError("Oops, we haven't got JSON!");
     }
     return response.json();
  })
  .then(data => {
      /* process your data further */
  })
  .catch(error => console.error(error));

References
https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch