MongoDB aggregate pipeline to find min and max for unique reporting station id
New here? Learn about Bountify and follow @bountify to get notified of new bounties! x

I need a MongoDB aggregate pipeline to classify some records as follows:

The database contains:

{"state": , "city": , "date": , "station": , "report_id": .... }

There are numerous reports for each station. As an aggregate search result, I need one object for each station and a minimum and maximum for the date field for that station. The record should be returned with the city and state.

My starting point follows:

  const pipeline = [
      { $match: {"station": arg } },
      { $group : {
           _id : null, 
           minimumDate: { $min: "$date" },
           maximumDate: { $max: "$date" }
        }
      }
  ];
  return records.aggregate(pipeline).toArray();

My question is how to create an array that has "city" and "state" in the array, not just "minimumDate " and "maximumDate"? Do I need to do a second search of can I do it in the same aggregate pipeline? My starting point just does this for one searched station.

I'd like the single search to produce an array entry for each unique station in the database. I'm currently getting the list of stations like this:

exports = function(arg) {
  // Connect to MongoDB Atlas
  const records = context.services.get("mongodb-atlas").db("us").collection("stationdata");

  // We need to use aggregate since .distinct is not supported in Stitch
  const pipeline = [
     { $match: {"state": arg } },
     { $group: {  _id: null, stations: { $addToSet: "$station" } } }
  ];

  // Run the aggregation to get the list of active states
  return records.aggregate(pipeline).toArray().then(c => c[0].stations);
}; 
Can you provide an example of what you're trying to achieve? I mean, a sample dataset and the expected result.
gabrielsimoes 5 months ago
Nevermind, I now understand what you need.
gabrielsimoes 5 months ago
awarded to Wuddrum
Tags
mongodb

Crowdsource coding tasks.

2 Solutions


This is a start, I'll improve it later (shouldn't be using max for station, city and state):

const pipeline = [
    {
        $group: {
            _id: {
                station: {
                    $max: "$station"
                }
            },
            mindate: {
                $min: "$date"
            },
            maxdate: {
                $max: "$date"
            },
            city: {
                $max: "$city"
            },
            state: {
                $max: "$state"
            }
        }
    },
];
That's sort of where I got stuck. I could't get it to work with just city: "$city"
billsouthworth 5 months ago
Winning solution

Wouldn't something like this work?

const pipeline = [
  {
    $match: {
      "station": arg
    }
  },
  {
    $group: {
      _id: null,
      minimumDate: {
        $min: "$date"
      },
      maximumDate: {
        $max: "$date"
      },
      stations: {
        $addToSet: {
          "state": "$state",
          "city": "$city"
        }
      }
    }
  }
]

This would produce the following:

[
  {
    "_id": null,
    "maximumDate": "2018-09-18T16:00:00Z",
    "minimumDate": "2004-05-20T16:00:00Z",
    "stations": [
      {
        "city": "Washington",
        "state": "DC"
      },
      {
        "city": "New York",
        "state": "NY"
      },
      {
        "city": "Los Angeles",
        "state": "CA"
      }
    ]
  }
]

This way, if you have to fetch from some kind of API, you'll save some bandwidth by not attaching minimum and maximum dates to each entry.

This works. Not particularly pretty though.
billsouthworth 5 months ago
View Timeline