Earlier this week, I got a call from a friend with an odd question: “I created a document and uploaded it to my website. Why does the document have a third party listed in Google Chrome’s integrated PDF viewer?” This caught me off guard, so I took a look at her site. Sure enough, when the uploaded PDF was open, Google Chrome displayed an unknown third party’s name in the top display bar. I downloaded the PDF and took a peek at it in Adobe Acrobat. A quick glance at the file’s metadata explained the issue: both the “title” and “author” entries displayed the unknown third party’s name. Keep in mind, the content of the PDF was written by my friend and not plagiarized. So, I posed a question: “Did you create this file from scratch or download it from the web?” Of course, I already knew the answer. She had downloaded the file from the web and built her content on top of it. The file still contained the third party’s metadata, and Google Chrome used that metadata to display a title. I created a new PDF with metadata that suited her content, reconstructed the content in the new file, and sent it back to her. I instructed her to re-upload the file to fix the issue.
So, what was the issue? Well, a file carrying a random third party’s metadata was on my friend’s site for everyone to see. Her constituents would have undoubtedly seen a foreign title and questioned both the accuracy and authenticity of the displayed content. That said, how can this be avoided? First, by constructing content in a fresh file rather than reusing a preexisting one from a third party. Second, by understanding that files, as well as all web-related elements, contain metadata.
Simply put, metadata is data about data. It is used to contextualize, tag and augment content for use in databases and websites. Metadata can answer important questions about data, including who, what, when, where, why and how. It can also be used for copyright and licensing purposes. An important metadata entry is “rights” because it outlines how content can be used, modified and distributed. In my next post, I will cover more about the “rights” tag, but, for now, keep this in mind: every downloaded file contains metadata. In the digital age, metadata affects all of us. Whether or not we know it, information about millions of sources is housed on our local workstations and PCs. When we communicate with one another through the web, we unknowingly pass along third-party information. Most of the time, we have no idea what might come to light, as in the incident with my friend.
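For readers who would rather check a PDF before uploading it than be surprised afterward, the kind of look I took in Acrobat can be roughly approximated with a short script. What follows is only a minimal sketch in Python, not a real PDF parser. The function name `read_info_entries` is my own invention for illustration, and the approach assumes the “title” and “author” entries are stored as plain, uncompressed literal strings such as `/Title (Some text)`. Many real files store them compressed, hex-encoded or as UTF-16, and a PDF with bookmarks may contain other `/Title` entries that match first, so a dedicated PDF library is the safer choice for anything serious.

```python
import re

# A PDF literal string: "(text)", where a backslash escapes the next byte.
# Assumption: no unescaped nested parentheses inside the string.
_LITERAL = rb"\(((?:\\.|[^\\()])*)\)"


def read_info_entries(pdf_bytes, keys=("Title", "Author")):
    """Return {key: text or None} for simple, uncompressed metadata entries.

    Hypothetical helper for illustration only; it scans raw bytes for the
    first "/Key (value)" pair and does not parse the PDF structure.
    """
    found = {}
    for key in keys:
        pattern = rb"/" + key.encode("ascii") + rb"\s*" + _LITERAL
        match = re.search(pattern, pdf_bytes)
        if match is None:
            found[key] = None
            continue
        # Undo the simple escapes \( \) and \\ ; other escapes are left as-is.
        raw = re.sub(rb"\\([()\\])", rb"\1", match.group(1))
        found[key] = raw.decode("latin-1")
    return found


# A tiny stand-in for the metadata portion of a PDF file.
sample = b"1 0 obj\n<< /Title (Quarterly Report) /Author (Someone Else) >>\nendobj\n"
print(read_info_entries(sample))
# prints {'Title': 'Quarterly Report', 'Author': 'Someone Else'}
```

To try it on an actual file, read the file in binary mode and pass the bytes in, for example `read_info_entries(open("upload.pdf", "rb").read())`, where `upload.pdf` is a placeholder name. If either value shows a name you do not recognize, that is a sign the file started life somewhere else.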