Codes and rules help keep your music library organized to achieve a balanced variety in your playlists. General playlist codes include artist separation, song rotation, tempo, and music style. But what about codes and rules for your library itself? Below are three simple rules to consistently code your music library.
Maintain Consistent Category Counts
There are typically three to five categories. For example, a contemporary station library might have five: A - Hot Currents, B - Medium Currents, C - New Currents, D - Recurrents, and E - Gold/Oldies.
Keeping the category counts consistent is important. If you want to move songs around in your library (moving them up the ladder, dropping them from the library, or adding something completely new), you need to do a balancing act of the categories to keep them fairly consistent. Rule of thumb: when adding a song to your library, take another one out.
So, if you add a new song to your rotation (usually starting in the C category), you'll need to remove another song from the C category. Are there any songs that aren't worth keeping in your library? If so, your job is easy. Remove that song, add in the new one, and call it a day.
However, if all the songs in the C category are good, decide which one should move up the ladder into the B category. You'll then need to decide which song leaves B, either by dropping it completely or by moving it up to A, and so on up the ladder.
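The one-in, one-out rule can be sketched in a few lines of Python. This is only an illustration: the `swap_in` helper, the dictionary layout, and the song names are hypothetical and do not come from any particular scheduling software.

```python
def swap_in(library, category, new_song, outgoing):
    """Add new_song to a category and take outgoing out, keeping the count unchanged."""
    songs = library[category]
    songs.remove(outgoing)  # raises ValueError if outgoing is not in this category
    songs.append(new_song)
    return outgoing


# Hypothetical library: category letter -> list of song titles.
library = {
    "A": ["a1", "a2"],
    "B": ["b1", "b2", "b3"],
    "C": ["c1", "c2", "c3"],
}
counts_before = {cat: len(songs) for cat, songs in library.items()}

# Every C song is good, so c1 moves up the ladder to B and b3 is dropped entirely.
promoted = swap_in(library, "C", "new song", outgoing="c1")
dropped = swap_in(library, "B", promoted, outgoing="b3")

counts_after = {cat: len(songs) for cat, songs in library.items()}
```

Each call changes one category's contents without changing its size, so chaining calls up the ladder leaves every category count as it was.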
Only Keep the Big Hits in the Recurrent Category
Recurrent hits are important to maintain your current audience. Songs in this category should play at least a quarter as often as those in the Hot Currents category, but make sure the rotation stays consistent. Don't let this category get so big that it slows down the rest of the categories!
In order to keep the category counts consistent, it's important to make sure that only the true hits move from one of the Current categories into the Recurrent category. If it's not a true hit, remove it completely from your library. Just because a song was once a Hot Current doesn't automatically mean it should move to Recurrents when it's past its prime.
The most common mistake is letting the Recurrents category fill up with non-hits. If this category is full of weak or marginal songs, the category is weak, and even one weak category can make for a weak playlist.
Don't Get Too Far Ahead
Charts, like the Billboard Hot 100 or charts on the streaming services, are generally good indicators of the latest and greatest music trends. But they move pretty quickly. Just because a song has dropped on a chart doesn't mean you should rush to move it out of heavy rotation! You don't want to drop a song just before it reaches its peak.
Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data.
There are some other libraries to do nearest neighbor search. Annoy is almost as fast as the fastest libraries (see below), but there is another feature that really sets Annoy apart: it can use static files as indexes. In particular, this means you can share an index across processes. Annoy also decouples creating indexes from loading them, so you can pass around indexes as files and map them into memory quickly. Another nice thing about Annoy is that it tries to minimize memory footprint, so the indexes are quite small.
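The idea of a static, file-based index that is mapped into memory can be illustrated with Python's standard `mmap` module. This is a minimal sketch and not Annoy's file format or search algorithm: the flat float layout, the `DIM` constant, and the `write_vectors` and `nearest` helpers are assumptions made for illustration, and the search is a plain linear scan.

```python
import mmap
import os
import struct
import tempfile

DIM = 3  # hypothetical vector dimensionality
ROW = struct.Struct(f"{DIM}f")  # one vector = DIM 32-bit floats


def write_vectors(path, vectors):
    """Build step: write all vectors once into a flat binary file."""
    with open(path, "wb") as f:
        for v in vectors:
            f.write(ROW.pack(*v))


def nearest(path, query):
    """Load step: map the file read-only and scan it for the closest row."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        best, best_dist = None, float("inf")
        for i in range(len(mm) // ROW.size):
            v = ROW.unpack_from(mm, i * ROW.size)
            dist = sum((a - b) ** 2 for a, b in zip(v, query))
            if dist < best_dist:
                best, best_dist = i, dist
    return best


path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
write_vectors(path, [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (5.0, 5.0, 5.0)])
result = nearest(path, (0.9, 1.1, 1.0))
```

Because the file is mapped read-only, any number of processes can call `nearest` on the same path, and the operating system can serve them all from the same cached pages.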
We use it at Spotify for music recommendations. After running matrix factorization algorithms, every user/item can be represented as a vector in f-dimensional space. This library helps us search for similar users/items. We have many millions of tracks in a high-dimensional space, so memory usage is a prime concern.
You can also accept slower search times in favour of reduced loading times, memory usage, and disk IO. On supported platforms the index is prefaulted during load and save, causing the file to be pre-emptively read from disk into memory. If you set prefault to False, pages of the mmapped index are instead read from disk and cached in memory on-demand, as necessary for a search to complete. This can significantly increase early search times but may be better suited for systems with low memory compared to index size, when few queries are executed against a loaded index, and/or when large areas of the index are unlikely to be relevant to search queries.
Annoy works by using random projections and building up a tree. At every intermediate node in the tree, a random hyperplane is chosen, which divides the space into two subspaces. This hyperplane is chosen by sampling two points from the subset and taking the hyperplane equidistant from them.
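A single split of this kind might look like the following pure-Python sketch. It is not Annoy's actual C++ code: the function name, the return values, and the sample points are assumptions for illustration, and it covers only one node, not the full tree.

```python
import random


def split_by_random_hyperplane(points, rng):
    """Split points by the hyperplane equidistant from two randomly sampled points."""
    p, q = rng.sample(points, 2)
    # The hyperplane's normal points from q to p and it passes through their midpoint.
    normal = [a - b for a, b in zip(p, q)]
    midpoint = [(a + b) / 2 for a, b in zip(p, q)]
    offset = sum(n * m for n, m in zip(normal, midpoint))

    left, right = [], []
    for x in points:
        # Positive side: closer to p. Zero or negative: closer to q, or on the plane.
        side = sum(n * xi for n, xi in zip(normal, x)) - offset
        (left if side > 0 else right).append(x)
    return normal, offset, left, right


points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
rng = random.Random(0)  # seeded so the split is repeatable
normal, offset, left, right = split_by_random_hyperplane(points, rng)
```

As long as the two sampled points are distinct, each falls on its own side, so neither half is empty. Applying the same split recursively to `left` and `right` would build up the tree.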