By Mark Frieser

At our recent Sync Summit in Los Angeles, one word echoed with near-comic frequency: Metadata. By the second day, a drinking game was proposed where every time a speaker said “metadata”, the audience would take a shot.
Joking aside, music supervisors, coordinators, sync agents and technologists emphasized the importance of accurate metadata at least 50 times in two days – it’s that important to us.
Why? Because good metadata mitigates risk and saves time in the music discovery, evaluation, and clearance process.
The choice we’re faced with as decision makers is simple: when we receive music with bad metadata, we can either spend time attempting to clarify inconsistent, incomplete, missing or doubtful metadata (if we can even find the song in a search in the first place) or move on to a song with accurate metadata.
That’s the hard reality. Music with good metadata gets found and synced; music with bad metadata is lost and forgotten.
Good metadata instantly indicates you’re an industry professional who understands and implements industry best practices. In other words, it shows you’re someone we can work with. Bad metadata instantly indicates the potential for risk and wasted time.
On a practical level, this means if you submit music with bad metadata your music won’t be listened to or licensed.
Based on personal experience and discussions with industry peers, about 80% of the music we receive, whether from individual artists or major labels, has incomplete or inaccurate metadata, and as a result that music is never considered for projects.
Of course no one wants to be part of that 80%, but not everyone is willing or able to do the tedious work necessary to get their metadata right, and many don’t even know how.
I’ve tried to help people employ good metadata practices through classes, events and how-to documents, like my free metadata style guide, but let’s be realistic, this is little more than a drop in an ocean of industry-wide bad metadata.
What’s required is an industry-wide solution that everyone can use to perfect and automate the metadata entry and confirmation process.
And that’s where AI comes in.
But in order for AI to help fix this industry-wide problem, a lot of work needs to be done.
Here are some suggestions to help solve the metadata dilemma:
1. Create Industry-Wide Metadata Standards.
Though most industry experts agree on the baseline metadata required to ensure music is discoverable and clearable, there’s endless disagreement about the details. To review, this is the basic required metadata set for sync:
- Song Title
- Song Artist
- Album Title (if there’s an album)
- Album Artist
- Composer – this is where writer names and PRO information go
- Grouping – this is where publisher information goes
- Genre
- Year (of release)
- Track
- BPM
- Comments Section – where you put contact information, clearability, mood, instrumentation, what artist the song sounds like, tempo and other descriptive information
- Artwork
- Lyrics
If this basic information is entered correctly into the MP3’s ID3 tags (the fields where metadata is stored inside the MP3 file), the track is ready for sync.
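To make this concrete, here’s a minimal sketch of writing that baseline set into an MP3 programmatically, using Python’s mutagen library. The values are placeholders, and mapping “Grouping” to the TIT1 frame is an assumption; some tools read GRP1 instead.

```python
# pip install mutagen
from mutagen.id3 import (ID3, ID3NoHeaderError, TIT2, TPE1, TALB, TPE2,
                         TCOM, TIT1, TCON, TDRC, TRCK, TBPM, COMM, USLT,
                         APIC)

def tag_for_sync(path: str) -> None:
    """Write the baseline sync metadata set into an MP3's ID3 tags."""
    try:
        tags = ID3(path)       # load existing tags
    except ID3NoHeaderError:
        tags = ID3()           # file has no ID3 header yet

    tags.add(TIT2(encoding=3, text="Song Title"))
    tags.add(TPE1(encoding=3, text="Song Artist"))
    tags.add(TALB(encoding=3, text="Album Title"))       # if there's an album
    tags.add(TPE2(encoding=3, text="Album Artist"))
    # Composer: writer names and PRO information
    tags.add(TCOM(encoding=3, text="Jane Writer (ASCAP)"))
    # Grouping: publisher information (TIT1 here; some tools use GRP1)
    tags.add(TIT1(encoding=3, text="Example Publishing (ASCAP)"))
    tags.add(TCON(encoding=3, text="Indie Pop"))         # Genre
    tags.add(TDRC(encoding=3, text="2024"))              # Year of release
    tags.add(TRCK(encoding=3, text="1"))                 # Track
    tags.add(TBPM(encoding=3, text="118"))               # BPM
    # Comments: contact, clearability, mood, instrumentation, sounds-like
    tags.add(COMM(encoding=3, lang="eng", desc="",
                  text="Contact: jane@example.com | One-stop | Upbeat, "
                       "driving | Guitars, synths | Sounds like: Phoenix"))
    tags.add(USLT(encoding=3, lang="eng", desc="",
                  text="Full lyrics here..."))           # Lyrics
    with open("cover.jpg", "rb") as art:                 # Artwork
        tags.add(APIC(encoding=3, mime="image/jpeg", type=3,
                      desc="Cover", data=art.read()))
    tags.save(path)

tag_for_sync("my_song.mp3")
```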
But the devil is in the details.
Some people and systems put details in the composer field that logically belong in the comments field, or they suggest adding extraneous information that isn’t essential for music discovery and licensing.
Beyond this, terms describing a song can change based on who is filling in the metadata or designing the system. This results in confusing or extraneous information – for example, do you use “RIYL” or “Sounds Like” when referencing a band? And what about other languages? How do you incorporate Japanese, Korean or Spanish metadata? Most importantly, what are the actual terms that music supervisors and decision makers actually search for when looking for music?
Though most agree on the minimum metadata set, there’s no agreement on standard descriptive terms or where they belong in the ID3 tag, and this causes inefficiency and confusion.
So how do we fix this? Create an Industry-Wide Metadata Reference.
A board of acknowledged industry professionals, including music supervisors, rights holders, artists, technologists, sync agents, producers and other stakeholders, should be formed to develop a standardized metadata set defining what information belongs in every ID3 tag field. That same board should produce an official reference guide detailing the search terms music supervisors actually use, giving the industry a single standard for how metadata should be entered and how music should be described.
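Purely as an illustration, part of such a reference could be enforced in software with nothing more exotic than a shared synonym map that collapses variant descriptors to one canonical search term. The entries below are hypothetical stand-ins, not a proposed vocabulary.

```python
# Hypothetical entries; the real terms would come from the reference
# guide described above.
CANONICAL_TERMS = {
    "riyl": "sounds like",
    "sounds-like": "sounds like",
    "up-tempo": "uptempo",
    "up tempo": "uptempo",
    "hip-hop": "hip hop",
    "hiphop": "hip hop",
}

def normalize_tag(tag: str) -> str:
    """Collapse a descriptor to its canonical form (case-insensitive)."""
    key = tag.strip().lower()
    return CANONICAL_TERMS.get(key, key)

print(normalize_tag("RIYL"))      # -> "sounds like"
print(normalize_tag("Up-Tempo"))  # -> "uptempo"
```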
2. The 60% Rule: Refining AI’s Descriptive Metadata Output Accuracy
As of this writing, there are a variety of AI-powered systems that analyze a track and provide you with suggested descriptive tagging that you can enter into your metadata comments section.
For the most part, their output is inaccurate. Ballads are listed as uptempo songs, genres are mismatched or confused, instruments are miscategorized and so on.
At best, the accuracy of current AI-assisted music analysis and the resulting metadata output is between 20% and 40%. That’s a fail.
For metadata-savvy users, the output is little more than a scratch pad that may spark an idea or two, saving neither much time nor much money.
And for the majority of people who aren’t metadata-savvy, AI-assisted tagging is making metadata worse. Why? Because people will cut and paste the tags they receive into their metadata, correct or not.
This helps no one, because bad descriptions in metadata = bad search results.
For the music supervisor/decision maker, bad search results mean wasted time and energy. For artists and rights holders, they mean lost opportunity. When music doesn’t show up in a search, isn’t delivered to the right person for the right project, or is delivered incorrectly, decision makers get frustrated and rights holders’ music doesn’t get used.
AI-assisted descriptive tagging has incredible promise. At a minimum of 60% accuracy, it would cross the threshold where it becomes valuable: additive and helpful enough that rights holders actually save time they can then spend making and promoting music. And for music supervisors, better search results mean less time searching and more time devoted to creative tasks.
So how do we fix this? There Needs to be a “System of Systems”
A system that takes descriptive input from multiple AIs, analyzes the results and then returns a list of tags ranked by confidence could produce output that meets the 60% directionally-correct threshold.
As good as the current music analysis systems are, each is limited by being just one system: it can only produce output shaped by how that one system was built and trained. An AI trained to compare, contrast and analyze the results from multiple AIs, on the other hand, will provide far more accurate output than any single system.
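To make the idea concrete, here’s a minimal sketch in Python, assuming hypothetical tagging systems: the tags each system proposes are pooled, and each tag is scored by the fraction of systems that agree on it, with that agreement score standing in for confidence.

```python
from collections import Counter
from typing import Dict, List, Tuple

def merge_ai_tags(outputs: Dict[str, List[str]],
                  threshold: float = 0.6) -> List[Tuple[str, float]]:
    """Pool descriptive tags from several AI systems and score each tag
    by the fraction of systems that proposed it; keep only tags at or
    above the agreement threshold, highest-scoring first."""
    n_systems = len(outputs)
    counts = Counter(tag.lower() for tags in outputs.values()
                     for tag in set(tags))  # dedupe within each system
    scored = [(tag, count / n_systems) for tag, count in counts.items()]
    return sorted([t for t in scored if t[1] >= threshold],
                  key=lambda t: t[1], reverse=True)

# Hypothetical output from three different tagging systems:
print(merge_ai_tags({
    "system_a": ["ballad", "piano", "melancholy"],
    "system_b": ["ballad", "piano", "uptempo"],
    "system_c": ["ballad", "strings", "melancholy"],
}))
# "ballad" is unanimous (1.0); "piano" and "melancholy" clear the
# 0.6 bar at ~0.67; "uptempo" and "strings" are filtered out.
```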
3. Connect Publicly Available Ownership Data Systems to Metadata-Entry Systems.
This is vital. We need a way to automate the confirmation, cross-referencing and double-checking of writer, ownership, publisher and performance rights organization data, and a method to automate the entry of correct information into ID3 tags.
Current AI systems and methods rely mostly on rights holders and artists to enter their own ownership information into their metadata, which means the accuracy of ownership metadata depends on the owner’s ability to enter it correctly. Some do, some don’t. As a result, a lot of it ends up inaccurate.
This means music supervisors/decision makers must not only confirm ownership information with the rights holder, but also take time to contact multiple sources – from performance rights organizations to publishers (if available) – to cross-reference and verify that the ownership information the rights holder provided is accurate.
Any comprehensive AI-automated and assisted metadata solution must be able to cross-reference and confirm ownership with performance rights society databases of writers and publishers in order to ensure accuracy of data for both music supervisors and rights holders.
So how do we fix this? Automated AI-Assisted Metadata Entry Systems Need to be Connected to the Databases of the Performance Rights Societies (PROs).
The performance rights societies, like BMI in the USA and SOCAN in Canada, along with their counterparts around the world, hold the most comprehensive publicly available information on writer, publisher and PRO affiliation for the artists on their respective rosters. No database covers 100% of the world’s artists, since some writers’ and publishers’ information exists outside the system, but for the vast majority of compositions the PRO databases contain the most accurate and extensive ownership data available.
While the solution is simple, getting access to this information can be difficult on a global basis.
In some countries, like the US, most of this information is already available through the Songview database, and confirming rights information would require little more than an API call into it; other PROs, meanwhile, have little or none of their data searchable online.
The good news is that most of the larger PROs have a publicly searchable database online, and if these databases were connected to a metadata entry tool, they would allow quick confirmation of song ownership, saving rights holders and music supervisors/decision makers time, energy and money. This won’t happen overnight, but it’s worth doing because it would benefit the entire music industry.
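As an illustration of what such a connection could enable, here’s a sketch in Python. The endpoint, parameters and response shape are hypothetical placeholders, not a real PRO API; the point is the cross-referencing logic a metadata tool could run before trusting submitted ownership data.

```python
import requests  # third-party: pip install requests

# Hypothetical endpoint; a real integration would depend on whatever
# APIs the PROs actually expose.
PRO_API = "https://api.example-pro-database.org/works"

def confirm_ownership(title: str, writer: str,
                      claimed_publisher: str) -> bool:
    """Look up a work in a (hypothetical) PRO database and check whether
    the publisher claimed in the submitted metadata matches the record."""
    resp = requests.get(PRO_API,
                        params={"title": title, "writer": writer},
                        timeout=10)
    resp.raise_for_status()
    works = resp.json().get("works", [])
    registered = {p.lower() for work in works
                  for p in work.get("publishers", [])}
    return claimed_publisher.lower() in registered

# A mismatch would flag the track for human review instead of silently
# writing unverified ownership data into the ID3 tag.
```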
4. Ensure AI-Assisted Metadata Entry Systems Correctly Enter Metadata into ID3 Tags.
All of this information needs to be entered automatically, consistently and correctly into the ID3 tags of every MP3. Currently, that isn’t the case.
The systems people and companies use today to store music and enter metadata are either completely manual, requiring human entry of every metadata element, or some mix of manual and automated entry via custom systems that compile metadata according to the rules of that unique system – rules that often conflict with those of the ID3 tag fields within MP3s.
This causes significant problems. People are trained to enter metadata in whatever way works inside the closed system they use, but once an MP3 leaves that system, the metadata often doesn’t transfer completely, correctly or at all.
In some cases, this is because the metadata in a closed system doesn’t automatically populate the MP3’s ID3 tag fields; in others, it’s because metadata is written into ID3 fields without respecting their character limits, such as the hard 255-character limit of the comments field.
The result is that when the MP3 leaves the system, metadata that was duly entered is omitted or truncated, stripping vital rights, contact and descriptive information and rendering the track potentially unfindable or unlicensable. Obviously no one wants this to happen, but it does.
So how do we fix this? Every System, Whether Closed or Open, Must Ensure the Most Vital Metadata within the System is Always Entered Automatically into ID3 tags.
This is not a hard fix, but closed systems make it frustratingly difficult. Part of this may be due to company strategy, part to negligence, but the result is the same: music is sent and received with bad or missing metadata, and that causes inefficiencies throughout the sync ecosystem.
If you are a company with a closed system, make it easier for your users to ensure their metadata transfers from your system into their ID3 tags automatically.
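As a sketch of what that could look like in practice, here’s a simple pre-export check in Python. The field limits are placeholders based on the limits described above (such as the 255-character comments cap); a real exporter would use the limits of the target tag format.

```python
# Placeholder limits; a real exporter would use the target format's.
FIELD_LIMITS = {"comments": 255}

def validate_export(metadata: dict) -> list:
    """Warn about fields that would be truncated or dropped when a
    closed system's metadata is written into an MP3's ID3 tags."""
    warnings = []
    for field, value in metadata.items():
        limit = FIELD_LIMITS.get(field)
        if limit and len(value) > limit:
            warnings.append(f"'{field}' is {len(value)} chars and will "
                            f"be truncated to {limit}")
        if not value:
            warnings.append(f"'{field}' is empty and will be omitted")
    return warnings

for issue in validate_export({"comments": "Contact: ..." + "x" * 300,
                              "composer": ""}):
    print("WARNING:", issue)
```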
In summation, I believe the only solution to the current metadata mess is the smart development and implementation of technology to better automate the process. If we can:
- Standardize metadata entry requirements and terminology
- Increase the accuracy of AI descriptive metadata tagging
- Connect publicly available databases for cross-referencing
- Automate and standardize metadata entry
then we will create a metadata ecosystem that ensures all music has a chance to be discovered, evaluated and licensed – not just the current 20% with good metadata.
