As another analyzer feature, Petapator could generate (or otherwise help a user identify) one or more subsets of U.S. Classifications (or truncated classification terms) that capture all, or some threshold percentage, of the patents returned by the user's search.
The subset(s) could be selected based on the smallest number of classifications required to capture all results, and/or on the size of the patent set contained in the union of the classification groups (smallest, largest, or within a range). The idea would be to produce one or more suggested subsets of classifications that could be useful in a refined follow-up search in which the user would take out one or more other restrictions (typically a keyword restriction). In my own past searching, I have used intuition and trial-and-error to choose a sample of classifications for follow-up searches. The Petapator bar chart will already help me do that more efficiently, but it could be even better with some more automated analytics.
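To make the "smallest number of classifications" idea concrete, here is a rough sketch of one way it could be approximated. This is only my guess at an approach, not anything Petapator does today: it uses a greedy heuristic (repeatedly pick the classification that adds the most not-yet-covered results), which is not guaranteed to find the true minimum. The function name, the `coverage` parameter, and the sample patents and classification codes are all made up for illustration.

```python
def suggest_classifications(results, coverage=1.0):
    """Greedy sketch: pick classifications until `coverage` of results is captured.

    `results` is a hypothetical mapping of patent id -> set of classification codes.
    Returns (chosen classifications in pick order, set of covered patent ids).
    """
    # Invert the mapping: classification -> set of patents carrying it.
    by_class = {}
    for patent, codes in results.items():
        for code in codes:
            by_class.setdefault(code, set()).add(patent)

    target = coverage * len(results)
    chosen, covered = [], set()
    while len(covered) < target and by_class:
        # Classification adding the most uncovered patents; ties go to the
        # alphabetically first code so the output is deterministic.
        best = max(sorted(by_class), key=lambda c: len(by_class[c] - covered))
        gain = by_class[best] - covered
        if not gain:
            break  # nothing left to gain (e.g. unclassified patents remain)
        chosen.append(best)
        covered |= gain
    return chosen, covered


# Invented sample data: five search results and their classifications.
results = {
    "P1": {"435/6", "536/23"},
    "P2": {"435/6"},
    "P3": {"536/23", "514/44"},
    "P4": {"514/44"},
    "P5": {"435/6", "514/44"},
}
chosen, covered = suggest_classifications(results)
print(chosen)  # ['435/6', '514/44']
```

Passing something like `coverage=0.6` would stop as soon as 60% of the results are captured, which corresponds to the "threshold percentage" variant.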
Possible specific implementation - with the classifications sorted by frequency, split the bar for each classification below the highest-frequency one into two colors: one color representing the number of results that overlap with one or more of the higher-frequency classifications listed above it, and another color representing the number of results that do not overlap.
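The numbers behind such a two-color bar could be computed roughly as follows. Again, this is just a sketch under my own assumptions: `split_bars` and the sample data are hypothetical, and ties in frequency are broken alphabetically, which the real chart may or may not do.

```python
from collections import Counter


def split_bars(results):
    """For each classification (most frequent first), return
    (code, overlapping, new): how many of its results already appear under a
    higher-ranked classification, and how many do not.

    `results` is a hypothetical mapping of patent id -> set of classification codes.
    """
    counts = Counter(code for codes in results.values() for code in set(codes))
    # Sort by descending frequency, then by code for a stable order.
    ranked = sorted(counts, key=lambda c: (-counts[c], c))

    seen = set()  # patents already shown in a bar higher up the chart
    bars = []
    for code in ranked:
        members = {p for p, codes in results.items() if code in codes}
        overlapping = len(members & seen)
        bars.append((code, overlapping, len(members) - overlapping))
        seen |= members
    return bars


# Invented sample data: five search results and their classifications.
results = {
    "P1": {"435/6", "536/23"},
    "P2": {"435/6"},
    "P3": {"536/23", "514/44"},
    "P4": {"514/44"},
    "P5": {"435/6", "514/44"},
}
for code, overlapping, new in split_bars(results):
    print(f"{code}: {overlapping} overlapping, {new} new")
```

In this toy example the third bar (536/23) would be entirely the "overlap" color, which tells the user that adding it to a follow-up search brings in nothing the first two classifications did not already capture within the current results.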
A simpler implementation - give frequency statistics for parent classifications and not just for the narrower subclassifications. This would give a user an idea of how segregated or integrated the results are across subclassifications within a given parent classification.
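A sketch of the parent-level roll-up, assuming classification codes are written as "class/subclass" so that the parent is simply the part before the slash (that format, the function name `parent_stats`, and the sample data are my assumptions):

```python
from collections import defaultdict


def parent_stats(results):
    """Return {parent: (distinct patents, distinct subclassifications)}.

    `results` is a hypothetical mapping of patent id -> set of codes, each code
    assumed to look like "class/subclass".
    """
    patents = defaultdict(set)
    subclasses = defaultdict(set)
    for patent, codes in results.items():
        for code in codes:
            parent = code.split("/", 1)[0]
            patents[parent].add(patent)   # a patent counts once per parent
            subclasses[parent].add(code)
    return {p: (len(patents[p]), len(subclasses[p])) for p in patents}


# Invented sample data: four search results and their classifications.
results = {
    "P1": {"435/6", "435/91"},
    "P2": {"435/6"},
    "P3": {"435/91", "536/23"},
    "P4": {"536/23"},
}
stats = parent_stats(results)
print(stats["435"])  # (3, 2): three patents spread over two subclassifications
```

Comparing the parent's patent count with the sum of its subclassification counts gives a quick sense of overlap: here the two 435 subclassifications have two results each (four in total) but only three distinct patents, so one result sits in both.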
I would love to hear others' thoughts on this as well as the Petapator developers' thoughts.