Thanks for your interest.
My experimental results don't agree with your post.
***However, there is definitely the possibility that my testing is flawed.***
Accordingly, I will post my entire code, including testing.
I will report the results for r = 100 although many other values of r
give similar results.
The vector has a size of 100 million.
It is generated by 100 million calls to std::rand().
The sorting algorithm code and hashtable code will be posted below.
The two algorithms were tested against the same vector.
The longest subsequence with a range of <= 100 had length 310012
The hashing algorithm took 2.4 seconds and the sorting algorithm
took 8.3 seconds.
The IDE was a free version of Visual Studio
// ************************ ALGOS AND TEST ******************************
// Given a sequence expressed as a vector, and given a non-negative integer r:
// Find the maximum length of all subsequences which have a range <= r.
// range is defined to be the maximum - minimum as in statistics.
// This solution should have O(rN)
// Originally, this was in a coding test with r = 1.
#include <iostream>
#include <vector>
#include <unordered_map>
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <utility>
#include <climits>
int hashLongest(const std::vector<int>& vec, int r)
{
// For each int in the vector, count how many times it occurs.
std::unordered_map<int, int> allCounts;
for (int v : vec)
++allCounts[v];
// For each int k in the vector, count how many values of x occur
// such that x >= k and x <= k + r
std::unordered_map <int, int> leftCounts;
// The maximum value of leftCounts is the result
int Max = 0;
for (auto a : allCounts)
{
for (int i = 0; i <= r; ++i)
if ((long long)a.first + i > INT_MAX)
break;
else // if (allCounts.find(a.first + i) != allCounts.end())
leftCounts[a.first] += allCounts[a.first + i];
Max = std::max(Max, leftCounts[a.first]);
}
return Max;
}
// Previous sorting approach to solve the same problem.
// length is continually incremented.
int sortLongest(std::vector<int>& vec, int r)
{
std::sort(vec.begin(), vec.end());
int length = 0;
for (int behind = 0; behind < vec.size(); ++behind)
{
int ahead = behind + length; // Maximal length uncovered so far.
// After sorting, see if lower point of the maximal length subinterval is within r of the uppermost point
// If so we can extend the length and move to the next point for testing.
while (ahead < vec.size() && (long long)vec[behind] + r >= vec[ahead])
++ahead, ++length;
}
return length;
}
// Randomly generate a testing vector
// Default to a length of ten million.
std::vector<int> generate(int length = 100'000'000)
{
std::vector<int> vec(length);
for (int& i : vec)
i = std::rand();
return vec;
}
// A given vector is tested using either the sorting method
// or the hashing method.
// A bool is true for the hashing method and false otherwise.
// Returned is a pair whose first value is the maximum length
// according to the problem constraints, and whose second value
// is the number of microseconds taken.
std::pair<int, int> longest(std::vector<int>& vec, int r, bool hash)
{
auto start = std::chrono::high_resolution_clock::now();
int maxLength = hash ? hashLongest(vec, r) : sortLongest(vec, r);
auto stop = std::chrono::high_resolution_clock::now();
int time = std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
return { maxLength, time };
}
// A display function to display the information from a run
void display(bool hash, std::pair<int, int> info, int length = 10'000'000, int r = 1)
{
std::cout << "Experiment done on a randomly generated vector of length " << length;
std::cout << "\nAn attempt was made to find the maximum length of a subsequence of range " << r;
std::cout << "\nThe method used was " << (hash ? "hashing" : "sorting");
std::cout << "\nThe result is " << info.first << " and the time taken in microseconds is " << info.second << std::endl;
}
// Test both methods.
int main()
{
std::vector<int> vec = generate();
std::pair<int, int> hashResult = longest(vec, 100, true);
display(true, hashResult, vec.size(), 100);
std::pair<int, int> sortResult = longest(vec, 100, false);
display(false, sortResult, vec.size(), 100);
}