Metadata is a complex term with a deceptively straightforward definition.
It’s usually defined as data that describes other data. While technically correct, this barely scratches the surface of what metadata really is and why it matters.
This guide will slice right through the same old, copied and rewritten definitions for a more focused understanding. You’ll learn the different types of metadata, see helpful examples, find out how to use metadata and consider its potential for the future.
If you’re ready to get a simple, comprehensive definition and explanation of metadata and everything surrounding it, then let’s begin!
What is metadata?
As you know, ‘data about other data’ is a common definition of metadata. There’s a lot more to it than that, but we don’t need to over-complicate the concept, either. Simply put: Metadata is a way to organize and make otherwise inadequate data useful.
Saying that metadata is information about other data is about as helpful as saying an action movie is a type of film, or that a Beethoven symphony is a piece of music by Beethoven. Both of these definitions, like the overused metadata definition, are correct but horribly unhelpful at the same time.
Just as you might choose which unseen film to rent based on its metadata (the title, description, director, stars, etc.), many people rely on this same kind of helpful description to understand, learn from and organize their data.
Let’s say you’ve downloaded a picture, and your goal is to understand its contents completely without looking at the actual image, but instead relying on the metadata. In this particular instance, the title is labeled: Bears. That’s pretty helpful in giving us an idea of what type of picture we’re looking at. Depending on who we are, we may think it’s a photograph of a couple of grizzly bears in the snow, or maybe the 1985 NFL team the Chicago Bears. Metadata helps us fill in the gaps.
Next, we see the date: 1508. Now, based on the metadata about our image, we can assume that it’s a drawing or painting of bears, because photography didn’t exist when it was completed. Let’s continue.
The next piece of information we’re given is listed as author: Raphael. We can now assume our picture is painted (by Raphael).
Taken together, the metadata tells us: This is a painting of bears by Raphael, completed in the year 1508.
We could then organize the picture into different categories (paintings, animals, nature, etc.), which would help us relate it to other images.
As you begin to familiarize yourself with the concept of metadata, it’s important to see plenty of helpful examples to make it easy to understand. Let’s look at some common examples that I’ve selected.
Illuminating metadata examples
You’ll see plenty of examples throughout this guide that demonstrate ways in which metadata is used. However, this section will focus in-depth on specific examples, to show metadata in action.
Now, before going into these examples, you’ll need a brief understanding of what metadata elements are. These are the types of descriptors used for a particular piece of data. They are typically things such as title, author, dates, etc. It becomes clear why having the right elements is so important, as it would be helpful to have a title element for a book or a film, but probably not so much for an amateur photograph.
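To make the idea of elements concrete, here is a minimal Python sketch. The element sets and the `missing_elements` helper are illustrative assumptions based on the examples in this guide, not a formal standard.

```python
# Illustrative element sets per item type (assumed for this sketch, not a standard).
ELEMENTS_BY_TYPE = {
    "book": {"title", "author"},
    "film": {"title", "director", "cast", "release_date"},
    "photograph": {"photographer", "camera", "location", "date"},
}


def missing_elements(item_type: str, record: dict) -> set:
    """Return the expected elements that the record does not supply."""
    return ELEMENTS_BY_TYPE[item_type] - set(record)


# A hypothetical photograph record that only fills in two elements.
photo = {"photographer": "A. Example", "date": "2001-06-01"}
print(missing_elements("photograph", photo))
```

Note that "title" is expected for a book or a film but not for a photograph, which mirrors the point above about choosing the right elements for each kind of item.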
A movie is a very eye-opening way to show the importance of metadata and here’s why: The data of the film is the contents, i.e., what you see when you watch the movie. The metadata is all of the information on the box or cover. Without metadata, the only way to get a firm grasp on the data would be to watch the movie in its entirety.
For the example above, relevant metadata includes the following:
Title: Harry Potter and the Chamber of Secrets
Cast: Daniel Radcliffe, Emma Watson, Alan Rickman
Release date: Nov 15, 2002
Director: Chris Columbus
I will discuss digital images in an example later, but for now let’s focus on old-school print photographs.
Of course, the image above is a digital representation, but imagine it’s a print you can hold in your hand.
The common photo metadata elements include: author/photographer, camera make/model, location, date.
You’d be able to get a pretty good feel for the above photograph with just a few metadata elements. For example, if the photographer was someone who liked scenes of nature, and the location was given as well as the time, you could make an estimate that the picture in question was taken at sunrise in the mountains.
The assumption with metadata is that at some point someone might need to use it when they’re unable to access the data it describes. There are elements of this assumption in many situations involving paintings. There are also other factors that complicate the process if metadata isn’t available in full.
Let’s break down the different ways paintings are managed using metadata. The first thing that needs to be focused upon is the depth of the information provided. Here is a problem one might encounter with insufficient metadata about a painting:
Title: Night, Starry
Creator: Van Gogh
Type: oil painting
Subject: stars, night
Consider the above metadata and envision the painting it’s describing. Here is the painting which it refers to:
It would be a fair assumption that most had a different painting in mind. This one, perhaps?
Of course, the metadata isn’t inaccurate, but it’s misleading because it doesn’t answer enough qualifying questions. Including the more detailed title (Starry Night Over the Rhône) would likely be enough to make things clearer. The location of the painting’s subject matter would also give a firmer basis.
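The ambiguity can be sketched in a few lines of Python. The catalog structure, the field names and the `find_matches` helper are assumptions made for illustration; the point is only that sparse metadata matches more than one painting, while one extra element narrows it to a single record.

```python
# Hypothetical catalog records; the field names are illustrative.
catalog = [
    {"title": "The Starry Night", "creator": "Van Gogh",
     "type": "oil painting", "location": "Saint-Rémy-de-Provence"},
    {"title": "Starry Night Over the Rhône", "creator": "Van Gogh",
     "type": "oil painting", "location": "Arles"},
]


def find_matches(records, **criteria):
    """Return every record whose fields equal all of the given criteria."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


# Sparse metadata: both paintings fit the description.
sparse = find_matches(catalog, creator="Van Gogh", type="oil painting")

# Adding the location of the subject matter leaves exactly one candidate.
narrowed = find_matches(catalog, creator="Van Gogh", type="oil painting",
                        location="Arles")

print(len(sparse), [r["title"] for r in narrowed])
```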
Another important factor with paintings’ metadata is the evolution of technology. Where at one point, ‘oil’ may have sufficiently described a work of art, now things like ‘jpeg’, ‘png’, ‘photocopy’ and other variants become involved in the process. There are times when Starry Night, for example, is in JPEG format, and this might be relevant to the catalog it’s entering. The point of all this isn’t to say metadata is confusing, but to point out that lack of metadata is confusing, especially with paintings.
A book is similar to a film in its metadata, as the content of the book is the data. However, another layer to book metadata comes from tables of contents and other elements inside the book, as well as its outer layer.
Below we have the book ‘Alice’s Adventures in Wonderland’. The first picture is the data, since it’s the content of the book. Just remember that inside the book there is metadata as well.
The below picture, for example, is the inside cover, which is loaded with metadata (title, author, illustrator, publisher).
Finally, the outside of the book or cover (pictured below) has metadata as well (title, author).
Because of library systems, metadata has shown its value when it comes to different types of books. Though our methodology of locating books in libraries changes (now computerized), the usage of metadata has persisted.
Emails are the first example we will touch on where the metadata grows in technological complexity. With an email, there are both obvious and subtle pieces of metadata. The obvious forms are things like date sent, subject, recipient address, sender address and attachments. The metadata that often goes unseen by the average user includes things like which server is involved, underlying code, routing information and an SMTP address.
The data of an email, then, is the content of the email, including any attachments.
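As a rough illustration, Python’s standard `email` module can split a message into its headers (metadata) and its body (data). The message below is made up for this sketch, including the addresses and the `Received` routing line.

```python
from email import message_from_string

# A made-up message: the headers are metadata, the text after the blank line is data.
raw_message = """\
From: alice@example.com
To: bob@example.com
Subject: Lunch on Friday?
Date: Mon, 04 Mar 2024 09:15:00 +0000
Received: from mail.example.com by mx.example.com; Mon, 04 Mar 2024 09:15:02 +0000

Are you free for lunch this Friday?
"""

message = message_from_string(raw_message)

metadata = dict(message.items())   # header name -> header value
data = message.get_payload()       # the body of this single-part message

for name, value in metadata.items():
    print(f"{name}: {value}")
print("Body:", data.strip())
```

The `Received` header is an example of the routing metadata that most mail clients hide by default.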
It’s easy to overlook the metadata of a phone call, since we don’t think twice about making a connection anymore, but a call produces a decent amount of metadata: the time and date of the call, the caller and recipient, the potential location of each, and so on. Other details, such as whether the recipient was an individual or a business, come into play here as well.
Upon first thought, these seem innocuous and potentially useless, but they can be helpful information in the right circumstances. If nothing else, they can help our phone records tell at least part of a story, shedding light on why a call was made and for what purpose.
Note that the subject matter of the conversation is part of the data rather than the metadata. This calls into question anyone’s reliance on metadata to perfectly understand a phone call, since the parties listed may be incorrect (someone using another person’s phone, etc.), and the subject matter has to be assumed rather than known.
Computer files are one of the most commonly cited examples when metadata is discussed, particularly because of the sheer amount of metadata within computer files and how easily accessible it is to the average person.
There are too many metadata elements to list when it comes to digital files, and the elements change rather drastically depending on the file type. For this example, we’ll use an image file, pictured below. The image file I used is called ‘animation’. Some basic metadata listed is: type of file (JPG), Size, date created/accessed/modified. These are listed on the ‘General’ tab in Windows.
These are just scratching the surface, as there are many more elements in the ‘Details’ tab, including: size dimensions, resolution, bit depth, camera make and model and more.
The data of the computer file is the contents of the file, in this case the actual pictured contents of the image.
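The basic elements from the ‘General’ tab can also be read programmatically. The sketch below uses Python’s standard library on a throwaway placeholder file; the `file_metadata` helper is a hypothetical name, and creation time is left out because its availability varies by operating system. Elements such as dimensions or camera model live inside the image itself and would need an image library, which is not shown here.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def file_metadata(path: Path) -> dict:
    """Collect basic file-system metadata without reading the file's contents."""
    info = path.stat()
    return {
        "name": path.name,
        "type": path.suffix.lstrip(".").upper(),
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


# Placeholder bytes standing in for an image; this is not a valid JPG.
contents = b"\xff\xd8\xff\xe0 placeholder bytes"

with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "animation.jpg"
    sample.write_bytes(contents)
    details = file_metadata(sample)

print(details["name"], details["type"], details["size_bytes"])
```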
Now that you have some firm examples of metadata, let’s dig into why it’s more valuable than many realize.
Why metadata is more important than you think
Have you heard the saying about finding a needle in a haystack? It’s become cliche, but when you think about it, it would take endless hours and even days to find one.
Now imagine that before you begin sifting through mounds of hay in search of the tiny object, you’re given a metal detector as well as a map which gives the general location where the needle was hidden. Most people will rightly value the metal detector and map highly.
This is metadata in a nutshell. It’s great that the needles (data) exist, but they’re always hidden in a sea of endless data. Without the metal detector (metadata), you wouldn’t really care how valuable the data potentially could be.
And this isn’t to say that metadata’s only use is searching for data, but it does paint an illuminating picture of why, especially in today’s climate of digital data growth, metadata is so crucial.
Without further ado, let’s dig further into what makes metadata so important. The following are a few of the many reasons why metadata needs to be on everyone’s radar.
Expands the usage of specific data
One interesting and underutilized aspect of metadata is that it can broaden the usage of certain pieces of data. When a piece of information or group of data is dedicated to a distinct function, it’s often the attached metadata that allows it to be used for something else.
For example, the content of numerous computer files might initially only serve those who are willing to sift through all of it and understand the information within. However, other parties could gather further uses for the data by exploring the metadata attached.
Preserves and supports archived data
It’s common knowledge that metadata helps store and preserve large amounts of information, but what isn’t always known is how dramatically this role has grown, especially with the introduction of digital data.
In the past, physical items were stored using metadata. Now, many of these are digital items stored using metadata. With the constant growth of digital data, the purpose of metadata has grown and propped up certain industries that rely on the successful and efficient storage of information. This helps in all different types of situations, such as pulling legal records or finding personnel information.
Guides people’s situational decisions
It’s common to hear how metadata helps organize and store things. For example, in a large library, metadata helps maintain the books, their order and how easy it is to find them.
It’s true: Metadata is great for finding things when you know exactly what you’re looking for. But because metadata is all about providing context, it can be just as useful if you don’t know what you’re looking for yet – especially if you’re narrowing down options to make a decision.
If, for example, you were selecting a movie to watch on Netflix, you would use the metadata (title, preview image, description, stars/director) to guide your choice, not the actual content of the film (data). So, in a sense, metadata is ultimately responsible for helping us make choices concerning data.
Now that you have some important reasons to understand metadata, let’s examine the different types.
Types of metadata
Descriptive metadata is likely what most people think of when they hear metadata. This type of metadata gives information about items depending on the applicable elements of the object. These elements are ever-changing and vary, but they remain fairly consistent when objects of similar categories are in focus. For example, ‘title’ is a common element in metadata for a book.
Descriptive metadata is quite easy to grasp, as it simply is the information that provides insight into certain items, such as title of a movie, director of the film, etc. Where it can get complicated is when the item being described has rare or unique elements needed to describe it. This is most often the case with things like webpages, as the descriptive metadata needed for these is more code-oriented. However, a pattern will quickly emerge as you begin to see metadata for these types of items as well.
Use metadata is information collected and organized when someone accesses or uses a digital object.
For example, let’s say that Amazon.com wanted to be more involved in the data of their digital sale items. They decide to focus on the movie ‘Friday the 13th’, which they sell as a digital download. By focusing on use metadata, they could find out how many copies were sold/downloaded in the year, what time of day, which day of the week or month, what items were purchased alongside the movie and the buyer’s geographical location.
Now, upon first glance, this seems like information overload, and one might feel this is a poor use of metadata. However, this could not be further from the truth. In fact, in the above example, use metadata has given Amazon a goldmine of information, which can be used to better sell their products in the future. Here’s how.
First and most obviously, the number of copies purchased gives a baseline for comparison with previous and future years. Next, the time of day of each purchase reveals buying habits for films in general (though probably not for specific titles).
The use metadata gives Amazon the day of the week the movie is purchased as well. If most of their sales are coming on Friday, it’s a fair statement to say that the day itself (Friday) is reminding people of the movie (Friday the 13th), if only subconsciously. They can also see which month of the year it’s purchased in. If there are more purchases in October, it can be assumed that the film is being remembered by customers during Halloween. As you can see, the sky is the limit with metadata.
So how does this example explain how use metadata is helpful? In the above scenario, it prepares the vendor to make more sales at a certain time of the year or day of the week. Whether this results in them raising or lowering prices is up to them. The fact is they have a better understanding of market habits, at least within their industry. For example, Amazon could raise the prices of ‘Friday the 13th’ in October or on Fridays. They could also recommend it to customers purchasing other scary films.
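A minimal sketch of this kind of analysis, using invented purchase dates rather than any real sales data, might count purchases by weekday and by month:

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Invented purchase dates for one digital title; real use metadata
# would come from the vendor's own sales logs.
purchase_dates = [
    date(2023, 10, 13),  # Friday
    date(2023, 10, 20),  # Friday
    date(2023, 10, 27),  # Friday
    date(2023, 10, 31),  # Tuesday
    date(2023, 6, 3),    # Saturday
]

# Tally purchases by day of the week and by month number.
by_weekday = Counter(WEEKDAYS[d.weekday()] for d in purchase_dates)
by_month = Counter(d.month for d in purchase_dates)

busiest_day, _ = by_weekday.most_common(1)[0]
busiest_month, _ = by_month.most_common(1)[0]

print(busiest_day, busiest_month)
```

With these made-up dates, Friday and October come out on top, which is the sort of pattern a vendor could feed into pricing or recommendations.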
This type of metadata is collected often with the intention to make inferences about the user involved.
Administrative metadata gives information about data in its entirety – from its creation to its final, or current, state. The result is, as the name suggests, a wealth of information to be used for helping administrate numerous pieces of data.
An easy way to remember administrative metadata is to think of it like this: It is a simplified version of the data. No matter how complex or extensive you believe certain data sets to be, remember that their resulting metadata is much more expansive. This expansion doesn’t result, as it might sound, in higher levels of complexity, but instead results in simplification, as the tangled web of data is clarified immensely.
Now that you are aware of the different types of metadata, here’s a look at some of the things you can do with metadata.
What can you do with metadata?
The question of what can you do with metadata has been indirectly answered throughout this guide, but I’ll touch on some of the most common and helpful things here.
One thing metadata can help you do is find information you need that would otherwise be nearly impossible to acquire. Many acts of national security, the ethics of which we will not dive into here, rely heavily on things like metadata to put together pieces of an incomplete puzzle. It also helps identify different web factors like location which would otherwise be tough to determine.
Fortunately, there are plenty of different types of software systems designed to boost your control over metadata, including metadata editors. These systems allow you to edit, maintain, organize and automate your metadata processes, which is particularly helpful when you regularly deal with expanding amounts of information.
Mostly though, you can use metadata in numerous ways that will be helpful to your everyday life. The possibilities are endless and growing, which we will discuss next.
What’s in store for metadata ahead
I will begin by stating the most obvious, yet also one of the most important things to understand about metadata’s future: It will continue to grow in quantity, quality and importance.
As most would expect, this is a result of the clear growth in data that we are seeing and expect to keep seeing. In fact, saying that data is growing at an exponential rate is now almost an understatement. Furthermore, there is a worldwide effort to unify data protection, building the case for continuing metadata growth.
It’s impossible to predict the future, but by looking at ways new programs and systems are changing and interacting, it becomes clear what to expect. For example, if we predict that online stores and sales will increase, metadata’s role will increase as well. And then there are the many companies who operate online systems which house databases accessible to the public. An online e-book platform, for example, would need its own unique metadata properties to succeed. This would be exclusive to their website. Popular sites already employ metadata practices to better understand their users. How this changes in the future will depend on the way the site grows or changes.
Political aspects and legalities
There’s one thing that I haven’t touched on yet that belongs here as well, and that is the legalities, philosophies and rules surrounding metadata and how these should influence its future use. Where this has caused rifts in the past has been in particular as a result of government entities using metadata for their own purposes, such as national security. We cannot predict the future of this specific subject, but it is safe to say that it will grow stronger in the public’s eye as they become more aware of the power of metadata.
This is because the type of data collection most people will be affected by, at least in their own perception, is any data that is collected about them. The more we shift to using digital services, the more metadata these services will require to operate optimally. It can be assumed that there will be some sort of breaking point that results either in limits or in nearly no limits. Whatever ends up happening, expect metadata to be at the forefront of consumers’ minds in the future.
I hope this information gave you a clearer way of looking at this important technical term.
The world of metadata is complex and continues to evolve. However, your understanding of it can grow alongside it. Now that you have some base principles down, any new areas of metadata that sprout up or change will be easy to identify and adapt to.