This is a longer post, but the gist is short:
- There is a lack of software for content-based video identification.
- I think that AcoustID + MusicBrainz could fill this vacuum.
- Spoiler: I think the audio tracks for videos can and should be used.
That said, I would appreciate it if you read on. The rest is my more elaborate thinking on this so far.
I very much hope that the idea is found worth discussing, and that this post sparks enough interest among knowledgeable folks to look into it and evaluate the possibilities and obstacles.
Best regards
Björn
The situation
Several video management systems (Plex, Emby, Kodi, Jellyfin, …) exist. They all depend on matching video files with a unique ID (in IMDB, TMDB, …).
This matching process is – gasp – based on filenames and folder structures.
That’s how Plex does it, and that seems to be a de facto standard that most applications at least understand.
There is nothing better out there. At least nothing widespread.
The result …
There are countless threads advising users to rename and reorganize their collections of movies and series.
Users spend endless hours doing that, and take up the time of forum contributors as well.
Users cannot keep movies and series together (e. g. Star Trek).
There is “helpful” software out there (e. g. Filebot), but it, too, only uses names to derive new filenames.
The matching rate is still not worthy of 21st-century software.
Once one collection is in order, exactly 0% of that effort feeds back to the community. The next user is on their own again.
… is painful.
How it should work in general:
The content of video files is fingerprinted, then matched against a central db which maps fingerprints to video IDs.
This video-ID can be used to pull from IMDB, TMDB and the likes.
This metadata can then be used to fill the tags in the file or to rename files … content-based and with high precision.
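To make the flow above a bit more concrete, here is a minimal sketch of the lookup step. It assumes fingerprints are lists of 32-bit integers (the raw form Chromaprint produces) and that the central db is simply a dictionary from video ID to a known fingerprint. The function names, the `max_error` threshold and the IDs are made up for illustration; a real service would also have to try different alignments between the two fingerprints, which this sketch skips.

```python
from typing import Dict, List, Optional


def bit_error_rate(fp_a: List[int], fp_b: List[int]) -> float:
    """Fraction of differing bits over the overlapping part of two raw fingerprints."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 1.0
    # zip stops at the shorter fingerprint, so only the overlap is compared.
    differing = sum(bin(a ^ b).count("1") for a, b in zip(fp_a, fp_b))
    return differing / (32 * n)


def identify(
    fingerprint: List[int],
    database: Dict[str, List[int]],
    max_error: float = 0.15,  # assumed threshold, not a tuned value
) -> Optional[str]:
    """Return the video ID whose known fingerprint is closest, or None if nothing is close enough."""
    best_id: Optional[str] = None
    best_error = max_error
    for video_id, known in database.items():
        error = bit_error_rate(fingerprint, known)
        if error <= best_error:
            best_id, best_error = video_id, error
    return best_id


# Hypothetical central db with placeholder IDs and toy fingerprints.
central_db = {
    "video-001": [0xFFFFFFFF, 0x00000000, 0x12345678],
    "video-002": [0x0F0F0F0F, 0xAAAAAAAA, 0x55555555],
}

# A slightly noisy copy of video-001 (two bits flipped).
query = [0xFFFFFFFE, 0x00000001, 0x12345678]
matched_id = identify(query, central_db)
```

The returned ID would then be the key for pulling metadata from IMDB, TMDB and the like.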
Fingerprint videos by their audio tracks
I see benefits in using the audio tracks rather than the video track, and little to no downside to this approach:
- Audio tracks are just as unique
- Easier on resources
- Proven algorithms and software (AcoustID, MusicBrainz)
- Even silent movies have audio tracks
- Language identification
- Videos with multiple tracks help enrich the db: track 1 (English) is known but track 2 (non-English) is not. Since both belong to the same video file, track 2 can be mapped in the central db to the same video. From then on, a video file that has only the non-English track is identified as well.
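The multi-track point in the list above could work roughly like the following sketch. It assumes the central db is a plain dictionary from a fingerprint key to a video ID, and it uses exact key lookup instead of fuzzy matching to keep things short. The function name `enrich` and all keys and IDs are hypothetical.

```python
from typing import Dict, List, Optional


def enrich(database: Dict[str, str], track_fingerprints: List[str]) -> Optional[str]:
    """Map the unknown audio tracks of one file to the video ID of its known tracks.

    Returns the video ID, or None if no track is known or the known tracks
    point to different videos (in which case nothing is written).
    """
    known_ids = {database[fp] for fp in track_fingerprints if fp in database}
    if len(known_ids) != 1:
        return None
    video_id = known_ids.pop()
    for fp in track_fingerprints:
        # setdefault leaves existing entries alone and adds only the new tracks.
        database.setdefault(fp, video_id)
    return video_id


db = {"fp-english": "video-001"}

# A file with a known English track and an unknown second track.
first = enrich(db, ["fp-english", "fp-other-language"])

# Later, a file that carries only the second track is identified too.
second = enrich(db, ["fp-other-language"])
```

Refusing to write anything when the known tracks disagree is one possible design choice; it keeps a single mislabeled file from spreading a wrong mapping.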
Some things are already there
Basic structures are already in place with AcoustID plus MusicBrainz:
- acoustic fingerprinting
- reading multimedia file containers
- a central database
- looking up metadata from sources
- writing metadata to multimedia file containers
- more that I am not seeing
- a community for the above!
Still a good bit of work
- use what exists with AcoustID and MusicBrainz, but set up separate handling for video (phase 1)
- adapt AcoustID and MusicBrainz to handle movies, e. g. movies are much longer than music tracks (phase 1)
- video files have different containers like mkv (phase 2)
- video files have different audio formats like DTS and AC3 (phase 2)
- metadata must be pulled from new sources like IMDB, TMDB (phase 1)
- more that I am not seeing
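For the container and audio format items in the list above, one conceivable route is to let ffmpeg decode a chosen audio track to WAV and then hand that to Chromaprint's `fpcalc` tool. The sketch below only builds the two command lines. It assumes both tools are installed, the helper names and the 120-second window are my own choices, and I have not checked how well the resulting fingerprints work for movie-length audio.

```python
from typing import List


def decode_command(video_path: str, track_index: int, wav_path: str, seconds: int) -> List[str]:
    """ffmpeg command that decodes one audio track (0-based) to a WAV file."""
    return [
        "ffmpeg", "-y", "-v", "error",
        "-i", video_path,
        "-map", f"0:a:{track_index}",  # select the n-th audio stream of the input
        "-t", str(seconds),            # only decode the first `seconds` seconds
        "-f", "wav", wav_path,
    ]


def fingerprint_command(wav_path: str, seconds: int) -> List[str]:
    """fpcalc command that prints the raw fingerprint of the decoded audio."""
    return ["fpcalc", "-raw", "-length", str(seconds), wav_path]


# Hypothetical file names, second audio track, first two minutes.
decode = decode_command("movie.mkv", 1, "track1.wav", 120)
fingerprint = fingerprint_command("track1.wav", 120)
```

Each list can be passed to `subprocess.run(..., check=True)`. Whether the start of a movie is a good window at all is an open question, since many films from the same studio open with identical logos and jingles.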
The chicken-and-egg problem
… can be solved.
There are lots of video collections (both movies and series) out there that are lovingly maintained.
Video management systems like Plex, Emby, Kodi, Jellyfin, … have engaged communities.
I would expect that enough video fans can be found who would allow their videos to be fingerprinted (by audio) and the fingerprints sent to MusicBrainz (VideoBrainz?) for the initial fill of the central database.