Nested Boxes Algorithm - based on Nested Dolls but fundamentally different?

I've tried to solve a problem related to the SPOJ Nested Dolls problem, where boxes with a two-dimensional bottom are used instead of dolls that differ in a single scale parameter. I have an algorithm but I'm very muddy about the actual theory behind the problem and whether better approaches exist. Can anyone help me understand the problem better and perhaps find a better algorithm?

As a refresher, the Nested Dolls problem as follows:

Given N Matryoshka dolls of a various sizes, find the smallest number of nested dolls that remain after optimally nesting the dolls inside one another. For each nested doll, if the outermost doll has size S then it either contains no dolls, or it contains one nested doll whose size is strictly smaller than S.

I don't know the full specifics of this problem, but from reading about it I believe the Nested Dolls problem can be solved by sorting the dolls by increasing size and repeatedly extracting the Longest Increasing Subsequence (LIS) from the size sequences, where ties are broken by choosing the subsequence that utilizes the biggest dolls. The number of nested dolls would be the number of subsequences extracted. I think this greedy algorithm works because:

a) Reducing the length of one of these subsequences introduces new dolls that cannot reduce the number of nested dolls found in future steps ("fewer is better")

b) Replacing a doll in the subsequence necessarily replaces a smaller doll in the set of remaining dolls with a larger doll, which cannot reduce the number of nested dolls found in future steps ("smaller is better")

This means the problem can be solved in O(N log N) with a good LIS algorithm.

But the boxes problem is different: Given N open boxes with various bottom dimensions, find the smallest number of box-stacks that remain after optimally nesting the boxes inside one another. For each box-stack, if the outermost box has dimension WxH then it either contains no boxes, or it contains one box-stack whose width and height are strictly smaller than W and H respectively.

This means there is no total ordering of the boxes - if box A doesn't fit in box B, it does not imply box B has the same size as A or that it will fit in box A, unlike Matryoshka dolls.

I have no idea if I'm right but I think it's no longer true that the optimal solution can be found by repeatedly extracting the LIS (or rather, the longest sequence of boxes fitting in each other), mainly because there's no good way to break ties. A box with larger area can still end up more useful for future steps if, say, we're comparing between a 1x17 box and a 5x4 box. Trying out all the tied LIS's sounds like exponential runtime. Am I right, or is there really a greedy way to do this?

I've only found one other post about this (Stacking boxes into fewest number of stacks efficiently?) which suggests using a graph theory approach to solve the problem. I have very little experience with graph theory so I have no idea how the approach works. I basically blindly took their word to make a bipartite graph of boxes, asserting that the number of box stacks = (number of boxes - size of the maximum matching). I then implemented the Fork Fulkerson algorithm in Java based on pseudocode without fully understanding how it actually solves the problem. I've tried my best to annotate the code with my thought process, but it irks me how this approach is so different from the Nested Dolls solution, and that it takes 150+ lines when I was challenged to do this in 1 hour. Is it true that there's no easier way to solve the problem?

Code:

import java.util.*;

public class NestedBoxes {
    private static final int SOURCE_INDEX = -1;
    private static final int SINK_INDEX = -2;

    private NestedBoxes() {
        // Unused
    }

    public static void main(String args[] ) throws Exception {
        // Get box dimensions from user input
        Scanner sc = new Scanner(System.in);
        int numBoxes = sc.nextInt();
        List<Rectangle> boxes = new ArrayList<>();
        for (int i = 0; i < numBoxes; i++) {
            Rectangle box = new Rectangle(sc.nextInt(), sc.nextInt());
            boxes.add(box);
        }

        // Sort boxes by bottom area as a useful heuristic
        Collections.sort(boxes, (b1, b2) -> Integer.compare(b1.width * b1.height, b2.width * b2.height));

        // Make a bipartite graph based on which boxes fit into each other, and
        //  add a source linking to all boxes and a sink linked by all boxes.
        // Forward edges go from the left (lower index) nodes to the right (higher index) nodes.
        // Each forward edge has a corresponding backward edge in the bipartite section.
        // Only one of the two edges are active at any point in time.
        Map<Integer, Map<Integer, BooleanVal>> graphEdges = new HashMap<>();
        Map<Integer, BooleanVal> sourceMap = new HashMap<>();
        graphEdges.put(SOURCE_INDEX, sourceMap);
        graphEdges.put(SINK_INDEX, new HashMap<>()); // Empty adjacency list for the sink
        for (int i = 0; i < numBoxes; i++) {
            // TreeMaps make the later DFS step prefer reaching the sink over other nodes, and prefer
            //  putting boxes into the smallest fitting box first, speeding up the search a bit since
            //  log(N) is not that bad compared to a large constant factor.
            graphEdges.put(i, new TreeMap<>());
            // Each node representing a box is duplicated in a bipartite graph, where node[i]
            //  matches with node[numBoxes + i] and represent the same box
            graphEdges.put(numBoxes + i, new TreeMap<>());
        }
        for (int i = 0; i < boxes.size(); i++) {
            // Boolean pointers are used so that backward edges ("flow") and
            //  forward edges ("capacity") are updated in tandem, maintaining that
            //  only one is active at any time.
            sourceMap.put(i, new BooleanPtr(true)); // Source -> Node
            graphEdges.get(numBoxes + i).put(SINK_INDEX, new BooleanPtr(true)); // Node -> Sink
            for (int j = i + 1; j < boxes.size(); j++) {
                if (fitsIn(boxes.get(i), boxes.get(j))) {
                    BooleanVal opening = new BooleanPtr(true);
                    graphEdges.get(i).put(numBoxes + j, opening); // Small box -> Big box
                    graphEdges.get(numBoxes + j).put(i, new Negation(opening)); // Small box <- Big box
                }
            }
        }
        Deque<Integer> path; // Paths are represented as stacks where the top is the first node in the path
        Set<Integer> visited = new HashSet<>(); // Giving the GC a break
        // Each DFS pass takes out the capacity of one edge from the source
        //  and adds a single edge to the bipartite matching generated.
        // The algorithm automatically backtracks if a suboptimal maximal matching is found because
        //  the path would take away edges and add new ones in if necessary.
        // This happens when the path zigzags using N backward edges and (N + 1) forward edges -
        //  removing a backward edge corresponds to removing a connection from the matching, and using extra
        //  forward edges will add new connections to the matching.
        // So if no more DFS passes are possible, then no amount of readjustment will increase the size
        //  of the matching, so the number of passes equals the size of the maximum matching of the bipartite graph.
        int numPasses = 0;
        while ((path = depthFirstSearch(graphEdges, SOURCE_INDEX, SINK_INDEX, visited)) != null) {
            visited.clear();
            Integer current = SOURCE_INDEX;
            path.pop();
            for (Integer node : path) {
                // Take out the edges visited.
                //  Taking away any backward edges automatically adds back the corresponding forward edge,
                //  and similarly removing a forward edge adds back the backward edge.
                graphEdges.get(current).get(node).setBoolValue(false);
                current = node;
            }
            numPasses++;
        }

        // Print out the stacks made from the boxes. Here, deleted forward edges / available backward edges
        //  represent opportunities to nest boxes that have actually been used in the solution.
        System.out.println("Box stacks:");
        visited.clear();
        for (int i = 0; i < numBoxes; i++) {
            Integer current = i;
            if (visited.contains(current)) {
                continue;
            }
            visited.add(current);
            boolean halt = false;
            while (!halt) {
                halt = true;
                System.out.print(boxes.get(current));
                for (Map.Entry<Integer, BooleanVal> entry : graphEdges.get(current).entrySet()) {
                    int neighbor = entry.getKey() - numBoxes;
                    if (!visited.contains(neighbor) && !entry.getValue().getBoolValue()) {
                        System.out.print("->");
                        visited.add(neighbor);
                        current = neighbor;
                        halt = false;
                        break;
                    }
                }
            }
            System.out.println();
        }
        System.out.println();

        // Let a box-stack be a set of any positive number boxes nested into one another, including 1.
        // Beginning with each box-stack being a single box, we can nest them to reduce the box-stack count.
        // Each DFS pass, or edge in the maximal matching, represents a single nesting opportunity that has
        //  been used. Each used opportunity removes one from the number of box-stacks. so the total number
        //  of box-stacks will be the number of boxes minus the number of passes.
        System.out.println("Number of box-stacks: " + (numBoxes - numPasses));
    }

    private static Deque<Integer> depthFirstSearch(Map<Integer, Map<Integer, BooleanVal>> graphEdges,
                                           int source, int sink, Set<Integer> visited) {
        if (source == sink) {
            // Base case where the path visits only one node
            Deque<Integer> result = new ArrayDeque<>();
            result.push(sink);
            return result;
        }

        // Get all the neighbors of the source node
        Map<Integer, BooleanVal> neighbors = graphEdges.get(source);
        for (Map.Entry<Integer, BooleanVal> entry : neighbors.entrySet()) {
            Integer neighbor = entry.getKey();
            if (!visited.contains(neighbor) && entry.getValue().getBoolValue()) {
                // The neighbor hasn't been visited before, and the edge is active so the
                //  DFS attempts to include this edge into the path.
                visited.add(neighbor);
                // Trying to find a path from the neighbor to the sink
                Deque<Integer> path = depthFirstSearch(graphEdges, neighbor, sink, visited);
                if (path != null) {
                    // Adds the source onto the path found
                    path.push(source);
                    return path;
                } else {
                    // Pretend we never visited the neighbor and move on
                    visited.remove(neighbor);
                }
            }
        }
        // No paths were found
        return null;
    }

    // Interface for a mutable boolean value
    private interface BooleanVal {
        boolean getBoolValue();
        void setBoolValue(boolean val);
    }

    // A boolean pointer
    private static class BooleanPtr implements BooleanVal {
        private boolean value;

        public BooleanPtr(boolean value) {
            this.value = value;
        }

        @Override
        public boolean getBoolValue() {
            return value;
        }

        @Override
        public void setBoolValue(boolean value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return "" + value;
        }
    }

    // The negation of a boolean value
    private static class Negation implements BooleanVal {
        private BooleanVal ptr;

        public Negation(BooleanVal ptr) {
            this.ptr = ptr;
        }

        @Override
        public boolean getBoolValue() {
            return !ptr.getBoolValue();
        }

        @Override
        public void setBoolValue(boolean val) {
            ptr.setBoolValue(!val);
        }

        @Override
        public String toString() {
            return "" + getBoolValue();
        }
    }

    // Method to find if a rectangle strictly fits inside another
    private static boolean fitsIn(Rectangle rec1, Rectangle rec2) {
        return rec1.height < rec2.height && rec1.width < rec2.width;
    }

    // A helper class representing a rectangle, or the bottom of a box
    private static class Rectangle {
        public int width, height;

        public Rectangle(int width, int height) {
            this.width = width;
            this.height = height;
        }

        @Override
        public String toString() {
            return String.format("(%d, %d)", width, height);
        }
    }
}


ANSWERS:


Yes, there is a simpler (and a more efficient) solution.

Let's sort the boxes by their width (and in reverse order of their height if the width of two boxes is the same). It is clear that we can nest a box only into a box that goes after it. Thus, we want to split it into multiple increasing subsequences (considering only heights now). There is a theorem that says that the minimum number of increasing subsequences the sequence can be split into is equal to the the length of the longest non-increasing (that is, not strictly decreasning subsequence).

To sum up, the solution goes like this:

  1. Sort the boxes by their width. If the width is the same, compare them by their height in reverse order.

  2. Throw away the widths and just compute the length of the longest non-increasing subsequence of heights (in the order we've got after sorting). It's the answer to the problem. That's it.

It is clear that this solution can work in O(N log N) time if implemented properly.



 MORE:


 ? Nested Boxes Algorithm - based on Nested Dolls but fundamentally different?
 ? Nested Boxes Algorithm - based on Nested Dolls but fundamentally different?
 ? Stacking boxes into fewest number of stacks efficiently?
 ? Algorithm for Determining Variations of Differing Lengths
 ? Graph theory - force based autolayout algorithm
 ? What kind of algorithm is this? Box packing/Knapsack?
 ? What are the main differences between the Knuth-Morris-Pratt and Boyer-Moore search algorithms?
 ? How is dynamic programming different from greedy algorithms?
 ? The box stacking algorithm
 ? How to design strategy and algorithm for gold box game