Personal Content Experience (203/381)

Chapter 5: Realizing a Metadata Framework

1. Using files and in-memory data structures
2. RDF databases
3. SQL databases.

Using ordinary files and our own file formats would be the simplest solution. They would incur little overhead, since there are no database engines or other data management systems involved, apart from those we would develop ourselves for internal purposes only. We could relatively easily optimize all data structures to fit our purposes and the estimated most common metadata queries, rendering the whole design efficient.

However, the optimization is also the weak spot of the file-based solution. The design would be rigid and function as intended only in situations defined at the design stage. The dynamic features would be lost for good. As emphasized throughout this chapter, one of the most important aspects of the metadata management system is accommodating all kinds of metadata usages, now and in the future. Flexibility is the principal concern for us, effectively ruling out any hand-crafted optimization solutions that restrict the possible use cases or limit queries to a predefined set.

Another natural approach would be using RDF databases. Our metadata model is close to RDF, our internal API-level data model is indeed RDF, and what we store in our metadata database are RDF triplets.

RDF AND SEMANTIC WEB

Resource Description Framework (RDF) is a language originally intended to support knowledge interchange in the World Wide Web. Lately, it has become a good framework for modeling knowledge in many kinds of applications. Usually RDF is used with XML, but XML is not the only way to represent RDF knowledge. For instance, even though our framework models the semantics of metadata in RDF, we do not use XML syntax internally but only when exporting metadata in separate manifests. In its simplest form, RDF is a triplet, i.e., subject, verb, and object, where each part of the triplet is identified by a URI.
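As a minimal sketch of the triplet form, a few statements about Lewis Carroll can be held as plain tuples and queried by pattern rather than through a fixed set of queries. The `urn:example:` URIs, the `Triplet` type, and the `match` helper are hypothetical names invented for this illustration; they are not part of the framework described in this chapter.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Triplet(NamedTuple):
    """One RDF-style statement: subject, verb, object, each a URI string."""
    subject: str
    verb: str
    obj: str


# Hypothetical URIs, made up for this sketch only.
ALICE = "urn:example:book:alices-adventures-in-wonderland"
CARROLL = "urn:example:person:lewis-carroll"
DODGSON = "urn:example:person:charles-lutwidge-dodgson"
AUTHOR = "urn:example:verb:author"
PSEUDONYM_FOR = "urn:example:verb:pseudonym-for"

store = [
    Triplet(ALICE, AUTHOR, CARROLL),
    Triplet(CARROLL, PSEUDONYM_FOR, DODGSON),
]


def match(
    triplets: Iterable[Triplet],
    subject: Optional[str] = None,
    verb: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triplet]:
    """Yield every triplet that fits the pattern; None matches anything."""
    pattern = (subject, verb, obj)
    for triplet in triplets:
        if all(want is None or want == have
               for want, have in zip(pattern, triplet)):
            yield triplet


# Who is the author of the book?
authors = [t.obj for t in match(store, subject=ALICE, verb=AUTHOR)]

# What else is stated about that author?
about_author = list(match(store, subject=authors[0]))
```

Because any of the three positions can be left open, the same helper answers questions that were not anticipated when the data was written, which is the kind of flexibility a fixed file format would give up.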
Each triplet is a statement about some resource. So with RDF, we can say things like “Author of ‘Alice’s Adventures in Wonderland’ is Lewis Carroll” or “ ‘Alice’s Adventures in Wonderland’ was released in 1865”. Everything in RDF is a resource. It is possible to further describe Lewis Carroll by making statements about him. For example, you can make statements such as “Lewis Carroll is a pseudonym for Charles Lutwidge Dodgson” and “Charles Lutwidge Dodgson was born on January 27, 1832.” Together, all RDF triplets form a directed graph where each resource (i.e., subject, object, or anything that can have a URI) is connected to another (with a verb). Here we make a small extension to pure RDF, since we specify that for
