On Monday, 19 November 2018 12:19:51 UTC+2, Paul wrote:
> Below is my solution to finding the longest word in a dictionary given
> a collection of letters. For more (but perhaps still not total) clarity,
> please see my code below. It's an interview question and interviewers
> don't usually give total clarity either.
> I would love to see what people think of my solution. In particular
> is it better to use a vector or a set for the sorted container
> (or doesn't it matter?)
> [BTW, this solution wasn't actually presented at the interview itself,
> so even if it's good it doesn't mean I got the job].
>
> Many thanks.
>
> // Practising the common interview question of finding a longest word in a given dictionary
> // with all characters in a given array of letters. The array may contain duplicate characters.
> // In that case, the number of times that a char, c, can occur = the number of times
> // c is duplicated in the array
> #include <string>
> #include <unordered_set>
> #include <vector>
> #include <iostream>
> #include <algorithm>
> #include <set>
>
> // First step is to acquire the character counts in the array of letters.
> std::vector<int> charCounts(const std::string& arrayOfLetters)
> {
> std::vector<int>counts(256);
That is pessimization of std::array<int,256> into std::vector<int>.
> for(char c : arrayOfLetters)
> ++counts[c];
>
> return counts;
> }
>
> std::string LongestWord(const std::string& arrayOfLetters, const std::unordered_set<std::string>& Dictionary)
> {
> // Preprocessing of array of letters
> const std::vector<int>& characterCounts = charCounts(arrayOfLetters);
> // Test the longest words first
> auto cmp = [](const std::string& word1, const std::string& word2)->bool{return word1.size() > word2.size();};
There is point to sort outside when several arrays of letters need
to be validated for same dictionary. It is likely wasteful to sort
for one pass search.
> const std::set<std::string, decltype(cmp)> sortedDictionary(Dictionary.begin(), Dictionary.end(), cmp);
That std::set destroys your algorithm when there are multiple entries
with same length in dictionary. Your test data below does not have
multiple entries with same length but generally it is the case.
Some std::multiset would work.
> // A vector solution would have been as below. Should I have done this?
> /*std::vector<std::string> sortedDictionary(Dictionary.begin(), Dictionary.end());
> std::sort(sortedDictionary.begin(), sortedDictionary.end(), cmp);*/
In sorted dictionary you should first skip "too long" entries
with std::upper_bound.
> // Sorted by length so we only need to find the first word that works.
> for(const std::string& w : sortedDictionary)
> {
> std::vector<int> charPositions(256);
> bool wordWorks = true;
> for(char c : w)
> if(++charPositions[c] > characterCounts[c])
> {
> wordWorks = false;
> break;
> }
I did not understand what you meant by "using continue" in
other post? The break makes perfect sense here.
> if(wordWorks)
> return w;
> }
>
> std::cout << "\No words in the dictionary satisfied the conditions";
> return "";
> }
>
> // Testing code -- "acce" is returned which is correct.
> // Needs more tests, though.
> int main()
> {
> std::unordered_set<std::string> words = {"acceca", "ace", "acce", "accce" };
> std::cout << LongestWord("acecad", words);
> }
Download some words from somewhere
stuff:
https://github.com/dwyl/english-words
Then you can use subsets of those for whatever tests.
If you try to maximize performance of your program then
you may discover that there are more efficient algorithms
than std::sort for sorting dictionary with lot of entries with
equal length by length.
These are likely not good idea to write in answer for a test
assignment.