OK I am about to slam my head through a wall. I am writing this post in the hopes that someone can help me out before I have to resort to — *sigh* — Stack Overflow.

Background

I am working on a community site using NodeJS to build a data access API for an eventual React front end. One of the endpoints I want to provide would ideally return a nested hierarchy of parent-child objects. The depth of children could be 0, or it could be maybe up to 12, and each level of the hierarchy could have any number of siblings.

My gut says go with a recursive function which would return an array of interface-defined data that could be stored in the parent instance. That parent could then be stored as a child of its own parent, and so on up the chain until I get back to the single, original record which started the whole thing.

Problem is, there’s a lot of Promises involved here, and I fear that the async nature of these bastards is behaving badly just outside of my field of vision; I cannot for the life of me figure out how to get recursion to work when execution just plows through without waiting for the 10th level of recursion to finish its work.

Setup

Here’s an example of the data structure:

{
  "id":1,
  "pid": 0,
  "Name": "America"
},
{
  "id":2,
  "pid":1,
  "Name": "New Hampshire"
},
{
  "id":3,
  "pid":2,
  "Name":"Manchester"
}

I’ve represented the data here in JSON, but the actual data is stored in a SQL Server database. ID is the primary key of the record. PID is the associated “parent ID”. In this case, Manchester is a child of New Hampshire, which is a child of America.

The actual SQL Server access code shouldn’t be important because I don’t want to spend all the time breaking down the MSSQL library design just to get to the point that calling a specific function GetStateList(pid) returns an array of JSON objects where pid matches the id argument passed in. This function returns a Promise. Here’s some pseudocode.

async function GetStateList(pid?: string): Promise<StateObject[]>{
   // We set up our inbound params, pid, at this point and associate
   // them with the MSSQL connection. 
   ...
   // Execute the stored proc
   let _results = await ExecuteStoredProcedure('spGetStateList')

   // Translate _results into an array of StateObject interface objects
   let _stateObjects: StateObject[] = TranslateData(_results);

   // Return the array
   return _stateObjects
}

Essentially the function just takes the args, translates them to stored procedure parameters, executes a stored procedure, converts the results to an interface-based object array, and returns the results.

Because I’m using ExpressJS, I’ve got a router set up to handle the path /states/hierarchy. This accepts a single querystring value id which is the ID of the topmost record (in this case, it’ll be 1). Using this ID, I get that top level record (calling another data access function GetState(id) which follows the exact same pattern as GetStateList, but only returns a single StateObject instance) and use this as the first step in the attempted recursion.

The recursive function is called GetChildren(state: StateObject).

The Problem

It’s been a long time since I’ve done recursion, and back when I did it, we didn’t have all of this fancy deferred results stuff. As painful as it would be to lock the UI, it would be a lot simpler if the code just waited for each step to return before doing anything else.

Here’s the gist of the route code and the recursive function as they stand now, which is to say “next to the window with its back towards a very angry developer”.

//Router path which is called with ?id=1
stateRouter.get('/hierarchy/',(req, res)=>{
   try {
      GetState(req.query.id)   //Get a single record with ID = 1
      .then((results) => {    // results is the single record "America"
         GetChildren(results)
         .then(outcome=>{
            GetChildren(results);   //What the hell do I do here?!
         })
      })
   } catch(err) { console.log(err) }
}

async function GetChildren(parent) {
   let _kids = []   // Internal array we use to build results from forEach

   //This is our MSSQL function which async gets a list of items whose
   //PID value = ID of a parent record.
   let _children = await GetStateList(parent.ID);

   //Assuming we have results...
   if (results) {
      _children.forEach(_child=>{
         GetChildren(child)
         .then(result=>{
            _kids.push(result)
         })
      })
   } else {
     //If we DO NOT have results, what happens? Anything?
   }
   parent.Children.push(..._kids)

   return parent
}  

Yes, this is completely wrong on absolutely every level, and developers across all timelines, living and dead, are smacking their heads in frustration, but that is why I am writing this post, not for help on correcting my above completely pseudocode mistakes, but to offer what I hope is at least a logical example of what I am trying to do.

At its most basic, I want to start at any record and, using its id, find its immediate children. For each of those children, I want children, and so on. Each child should be added to an array property in the StateObject interface instance representing its parent, so I end up with this when the original StateObject instance is serialized:

{
  "id":1,
  "pid": 0,
  "Name": "America",
  "Children":[
    {
      "id":2,
      "pid":1,
      "Name": "New Hampshire"

      "Children":[
        {
          "id":3,
          "pid":2,
          "Name":"Manchester",

          "Children":[]
        }
      ]
    }
  ]
}

I have searched around and have found many examples which come close to explaining async recursion, but none of them really follow through completely, especially not when dealing with a data layer and the possibility for several children and several children of children.

If you or someone you love has any insight in to how I could solve this problem, for the love of gawd please let me know. I don’t want to have to resort to overly simplifying the solution by repeating code in a synchronous manner.

1 Comment

  • Hirvox

    September 24, 2022 - 3:03 AM

    There’s two concepts here that can be combined to form the solution.

    First, async functions are really state machines, where any state transitions are triggered by an awaited function either completing successfully or failing.

    What you need to do here is to use the good old for loop instead of foreach to process all of the children. And when you call getChildren recursively, you _await_ that call as well. That way you are assured that you have completely processed the family tree of that particular child before you move on to the next. And when the for loop is complete, you exit the function normally.

    Just using the first concept will probably get you a working solution, but it may fail if you have a lot of children to process. Managing the stack uses up memory, and your runtime environment may impose limits to resource usage. Which is where the second concept comes in.

    Most problems where recursion is an answer can also be solved with a queue or a stack.

    You start by pushing the parent object into a queue. Then enter a loop that attempts to take the first item off the queue and will exit if the queue is empty. Add the dequeued item to your result array. Make your database call with the dequeued item as the parameter. When the awaited database call completes, push the children into the queue. If the queue ever becomes empty, it means that you ran out of children and are done.

Leave a Reply

Your email address will not be published. Required fields are marked *