Usability and effectiveness of a formal metadata model as compared to a collaborative, centrally controlled folksonomy, when used in the description and related organization of events in a special purpose repository
Nikolaos Tsatsakis
Computer Science Department, University of Crete, Iraklion, Hellas
INTRODUCTION
During our effort to implement a digital repository for the complete description, categorization and organization, digital preservation and maintenance of the cultural events held in the region of Crete, and for their subsequent discovery by the public, other creators and scholars, we had to make an important decision: “shall we adapt the events’ description process to a formal metadata model for such entities, or instead use a few mandatory descriptive elements along with tags that are submitted by the creators and match the audience’s perception of the events?”
We were aware of the theoretical pros and cons of both approaches, but in order to arrive at a usable and effective implementation of the service, we decided to perform an empirical study using an online survey with dependent questions, in which potential users evaluated the process of content navigation and discovery when events are described by each of the above-mentioned methods.
Almost two hundred participants from three main user categories (audience, events’ creators and scholars) contributed to our study, and the outcomes were close to those expected: simple descriptions and social tags are welcomed by the audience, which locates events through tags and simple searches, and by events’ creators, who find the detailed description workflow tedious, while complete descriptions based on a formal metadata model are the scholars’ choice.
In line with these results, we implemented a metadata-assisted events description process in our digital repository, using adequate elements from a formal model along with tags submitted by events’ creators, which constitute a folksonomy of events and are centrally approved by the repository’s curators.
Keywords: events’ repository; formal metadata model; folksonomy; empirical study; dynamic online survey
Parent Project
In a research project titled “Promotion of the tourism and cultural product through the symbolic representation of certain components in a VR360 map: the events’ case study”, which is funded through the Research, Technological Development and Innovation (RTDI) Action “RESEARCH – CREATE – INNOVATE”[1], we have been working on the development of an innovative web mapping service for the cultural events that take place in the region of Crete.
The service can be considered an “Events’ spatiotemporal map”[2], as it depicts on a map (spatially) all events held within the region of Crete in certain time periods (temporally), and comprises the following core components:
- A content management system or repository[3] where events’ creators or managers/promoters register and describe events in a step-by-step manner that complies with an adjusted version of a formal metadata model for events (van Hage et al., 2009). During the deposit process, terms and tags are used that are mainly selected from centrally updated controlled vocabularies.
- A tool used to collect and analyze data of the venues where cultural events take place, in order to support decisions on the creation of VR360 videos of added value for those venues.
- An interactive map interface where events are represented, at the locations in which they take place, by markers that spring apart when clicked. Each separate marker, when clicked, pops up a card with a brief description of the corresponding event, which in turn is linked to its full description (Figure 1). The full description also shows comments and evaluations of the event submitted by certified users, while events’ discovery is accomplished through faceted search along with clickable tags (Helic et al., 2011) and a related tag cloud representation (Figure 2).
Figure 1. The interactive Events’ map interface with brief and full description.
During the events’ registration process in the corresponding repository, an important question arose: “Which is the best way to describe events, in terms of ease during the deposit process and, of course, of events’ discovery?” A brief related discussion is presented in the subsection that follows.
Figure 2. Tag Cloud Representation: Size is for the Frequency of Occurrence, Color Represents the Time of Occurrence - Purple for Future, Green for Present and Grey for Past Events.
Metadata: Formal Schemas, Controlled Vocabularies and their Creation and Usage
As the realm of web resources continuously grows and evolves, the use of metadata in the organization, description, discovery, correlation and grouping of resources, and in their overall utilization, becomes more and more important, if not mandatory. Metadata are compared to an investment that, if wisely managed, can deliver a significant return on intellectual capital (Gilliland A.J., 2016).
The three main features of web resources, content, context and structure, have to be adequately reflected through metadata. According to this, metadata is not just “data about data” but value added information frequently governed by community-developed and community-fostered standards and best practices used to ensure quality, consistency, and interoperability (Gilliland A.J., 2016).
In order to describe metadata records (containers), i.e. the metadata structure, we make use of formal schemas, or metadata element sets, while the values that we use to populate the metadata elements come from controlled vocabularies or thesauri that usually follow certain rules and codes in their format and syntax. To give an example, schema.org[4], which creates, maintains and promotes schemata collaboratively through the activity of its community, provides the Event type[5], a metadata schema for events that comprises all required elements concerning anything that happens at a certain time and location, such as a concert, lecture, or festival (Schema.org, 2022), and is equivalent to the class dcmitype:Event (Dublin Core, Metadata Innovation, 2020). As for a controlled vocabulary, one can refer to the “Library of Congress Name Authority File”, a thesaurus for names of persons, organizations, events, places, and titles that aims at identifying these entities and providing uniform access to bibliographic resources (Library of Congress, 2022).
Originally, metadata creation and maintenance was an exclusive domain of the library community (Zhang, A., & Gourley, D., 2014), a fact that changed very soon, following the diversity of available web resources. Similarly, formal metadata schemas and thesauri are nowadays created collaboratively by all kinds of users, rather than only by trained information professionals. This results in more accurate, well-adapted schemas, as well as in social tagging and folksonomies: the former comes as an additional element that enhances existing formal schemas, while the latter act as collaborative thesauri that reflect perception and language usage on the web.
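To make the schema.org example above concrete, the following minimal sketch serializes one event as JSON-LD using properties of the schema.org Event type (`name`, `description`, `startDate`, `endDate`, `location`, `url`). The event itself, its dates, the example.org URL and the helper name `to_json_ld` are illustrative assumptions and are not taken from the repository.

```python
import json

# Hypothetical event expressed with schema.org Event properties.
# All values below are illustrative, not real repository content.
event = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Summer concert at Koule Fortress",
    "description": "Illustrative open-air concert.",
    "startDate": "2022-07-15T21:00:00+03:00",
    "endDate": "2022-07-15T23:30:00+03:00",
    "location": {
        "@type": "Place",
        "name": "Koule Fortress",
        "address": "Heraklion, Crete",
    },
    "url": "https://example.org/events/koule-concert",
}


def to_json_ld(record: dict) -> str:
    """Serialize a record as a JSON-LD string, keeping non-ASCII text readable."""
    return json.dumps(record, ensure_ascii=False, indent=2)


json_ld = to_json_ld(event)
print(json_ld)
```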
Events’ Description Process in our Project
From the title of our project, one can easily conclude that it mainly aims at promoting the tourism and cultural product of the region of Crete, by use of electronic services where events are deposited, organized, preserved and discovered by potential attendants.
Of course, other important objectives have been set throughout all stages in the design and implementation process that are summarized in the questions that follow:
- Do cultural events formulate history? If we record, organize and maintain the presence of cultural events in various locations in time, do we eventually obtain an alternative inventory of some important aspects of a place’s history?
- Do cultural events comprise an important component of the tourism product, and furthermore, is their description, presentation and successful repeatability able to regenerate and redefine tourism?
- Does the spatiotemporal map of events contribute to the touristic and cultural experience by depicting “routes” and creating tourists-travellers, or is it to be considered as unnecessary?
- Are the VR360 videos of the locations (venues) of cultural events as well as of the broader geographic, political and cultural environment capable of providing a sensory and cognitive foretaste of the experience of the events?
- Additionally, do VR360 videos constitute an effective means of inspiration for the creation and implementation of novel cultural and other events in a particular place?
- Eventually, is it the case that a spatiotemporal map of events (“eventful”) boosts investments related to tourism and culture (“investful”)?
To measure the achievement of the above-mentioned objectives, targeted studies should be conducted, but before that, appropriate content has to be deposited into the “Events’ repository” (mentioned in the previous section) and adequately described by appropriate metadata.
As one can easily imagine, the most important portion of the content that has to be created in the repository is the metadata themselves: it is the metadata that potential attendants and visitors search in order to locate events of interest, and it is the historical archive of metadata that will lead scholars to new historical hypotheses. This underlines the importance of the description procedure to be adopted during the registration of events in our repository.
In the subsections that follow we present the metadata schema that we have chosen from among existing formal schemata and the adaptations performed after certain evaluation processes, along with the controlled vocabularies from which we have, so far, obtained values for certain elements of the model.
From the Core Public Event Vocabulary to our Event Schema
In our effort to identify which metadata objects’ syntax should be applied in our case, in order to best meet the needs of the information creator, the repository and its users, we carried out a comprehensive study of existing, broadly used schemas for events. The following list gives the URLs of those that we considered most appropriate:
- https://semanticweb.cs.vu.nl/2009/11/sem/
- https://developers.google.com/search/docs/data-types/event
- https://schema.org/Event
- https://joinup.ec.europa.eu/solution/core-public-event-vocabulary
- http://linkedevents.org/ontology/
- https://github.com/italia/daf-ontologie-vocabolari-controllati/blob/master/Ontologie/CPEV/v0.4/CPEV-AP_IT.ttl
From the above we arrived at metadata descriptions for events in our repository that comprise at least the following properties:
| Property | Description |
|---|---|
| title | The title property captures the “formal” name given to the event. Titles may be provided in multiple languages with multiple instances of the title property. |
| description | This property contains detailed characteristics of the event. Descriptions may be provided in multiple languages with multiple instances of the description property. |
| url | This property links to the website of the event. The value of this property is a URL. If no website exists, the handle from the submission of the event in the repository is provided. |
| region | It refers to the geographical name in the form of region-prefecture-municipality/village (values from a controlled vocabulary). |
| location name | This property is about a certain toponym, e.g. “Koule Fortress”, or a brand name of a place, e.g. “Cine Studio”, where events are held (values from a controlled vocabulary which is centrally updated). |
| location description | A short description of the event’s location, with an optional reference to its capacity and mandatory references to accessibility facilities and to whether it is an open-air or covered place. |
| date | This property links to two DateTime instances specifying the start and end time of the event. |
| category | This property contains the nature or genre of the event. Examples include music, theater, festival, conference, exhibition, city council meeting, and many more (values from a controlled vocabulary which is centrally updated). |
| creator (single, multiple or group) | This refers to the main creators of the event. We will try to get values from specific authority files if the existing ones have been adequately updated, or create a controlled vocabulary for the purpose of our project. |
| creator’s birthdate | The date of birth of the main creator, which may be used by scholars for long-term sociological studies. |
| creator’s specialization | The special category to which a creator belongs, e.g. “Lyricist”, “Bassist”, “Pianist”, etc. It may have multiple values (obtained from a controlled vocabulary which is centrally updated). |
| is part of | A link to a “broader” (more general) event of which this event is a part, e.g. “The Municipality of Heraklion Summer Festival”. |
| audience | This property links to an Audience instance and specifies the group for whom the resource is intended or useful (values from a controlled vocabulary). |
| language | The language (values from a controlled vocabulary) of the content or performance, or the language used in an action, with an optional note if translation/interpretation is provided (in sup/subtitles, via interpreter etc.). |
| attendance offer | It refers to the cost of the tickets/attendance and may have multiple values or none, related to: presales offer, normal offer, discount, free or unknown offer. |
| attendance booking url | A URL to a ticket or seat/attendance booking service. It can be left empty. |
| implementation sponsor | This property specifies one or more entities (values from a controlled vocabulary) that financially or otherwise support the implementation of the event. It can be left empty. |
| communication sponsor | Entities (values from a controlled vocabulary) that contribute to the promotion/advertisement of the event. It can be left empty. |
| has organizer | This property specifies an entity (values from a controlled vocabulary) that organizes or coordinates the event, with reference to communication details (implicitly or via the organizer’s website). |
| has multimedia | Multimedia documents for the event. It can also refer to a URL where such material is deposited for access. |
| has (social) tags | Tags that are characteristic of the event and are not already values of any of the above-mentioned elements. They are provided by the events’ creators, either chosen from a controlled vocabulary (folksonomy) or submitted as new entries to be approved and appended to the vocabulary. |
Table 1: Properties of the metadata schema we used in the “Crete Events’ Repository”
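As an illustration of how a record following Table 1 might be held in software, the sketch below models a subset of the properties as a Python dataclass with two basic consistency checks. The class name `EventRecord`, the field names and the `problems` method are hypothetical choices made for this example; they do not describe the repository's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class EventRecord:
    """Hypothetical in-memory form of one event description (subset of Table 1)."""

    title: dict[str, str]          # language code -> title (multilingual)
    description: dict[str, str]    # language code -> description
    region: str                    # region-prefecture-municipality/village
    location_name: str             # value from a controlled vocabulary
    start: datetime                # the "date" property: start time
    end: datetime                  # the "date" property: end time
    category: str                  # value from a controlled vocabulary
    creators: list[str] = field(default_factory=list)
    url: Optional[str] = None          # repository handle if no website exists
    booking_url: Optional[str] = None  # may be left empty
    tags: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return a list of basic consistency issues (empty if none)."""
        issues = []
        if not any(text.strip() for text in self.title.values()):
            issues.append("title is empty")
        if self.end < self.start:
            issues.append("end precedes start")
        return issues


record = EventRecord(
    title={"en": "Jazz night"},
    description={"en": "Illustrative description."},
    region="Crete-Heraklion-Heraklion",
    location_name="Koule Fortress",
    start=datetime(2022, 7, 15, 21, 0),
    end=datetime(2022, 7, 15, 23, 0),
    category="music",
    tags=["jazz"],
)
print(record.problems())
```

Keeping the multilingual properties as dictionaries keyed by language code is one simple way to allow "multiple instances" of title and description; other representations are equally possible.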
As the descriptions of the event properties in the above table indicate, many of them take values from controlled vocabularies, either existing ones or ones built especially during the implementation of our project. Next, we describe those that are of most interest.
Controlled Vocabularies and the Corresponding Elements
From the very beginning of the testing phase of the Events’ Repository, it was apparent across all distinct categories of potential users (attendants, creators and scholars) that the use of controlled vocabularies when assigning values to metadata elements was indispensable. This was not only because of the numerous ways of expressing the value of a property with semantically equivalent terms, but also because in many cases users know idiomatic or idiosyncratic expressions of a term rather than its correct (exact) form.
From Table 1 we may observe that controlled vocabularies apply to about half of the elements (11 of 21) in the metadata schema we have adopted: region, location name, category, creator, creator’s specialization, audience, language, implementation sponsor, communication sponsor, has organizer and has (social) tags. In fact, these are elements that can be used to organize hierarchies, search buckets and straightforward tags of events.
We briefly discuss those that proved to be of special interest for our project.
Location name
Events’ locations were an important entity from the very beginning of our project. They are the venues where events take place, but also the places that we decided to promote by capturing them in VR360 videos, in which narration about their history and technical characteristics:
- provides events’ attendants with a sensory perception of the place they are going to visit in order to participate in the event,
- gives visitors (tourists) an integrated view of places they plan to visit, and
- enhances the perception and knowledge of the venue for events’ creators who have not visited the place and plan to use it in a future production.
Using a controlled vocabulary for location names was a prerequisite, so that the included locations would match POIs and Google Maps elements and could subsequently be pinpointed on our events’ spatiotemporal map (which uses Google Maps). It was also a necessity that contributes to interoperability with other mapping/positioning systems and to the composition of a constantly updated list of authorities related to events’ venues for the region of Crete.
The vocabulary is built up by checking locations’ names as they appear in Google Maps, as well as in other official web sources (information from the Regional Directorate of Crete, libraries, and scientific reports and references).
Category
Assigning events to categories is very important both for faceted search of events by the public and for the long-term study of events by scholars.
An initial list of events’ categories was provided to the repository’s users from the beginning, but as new events come to light, new categories may be needed.
Creator
Names of events’ creators are often the best way to advertise events to the public, as many events are aimed at fans of their creators. The use of authority files for artists’ names (Library of Congress, 2022) was a good start, but because many creators who are not internationally known, especially local artists, educators, animators etc., submit events to the repository, a controlled vocabulary of creators’ names proved to be a must.
Has (social) tags
Social tagging on repositories has been a trend for many years. It has emerged as one of the best ways of associating metadata with web resources. With the increase in the kinds of web resources becoming available, collaborative tagging for them is also developing vastly, and metadata generated in the form of tags can be efficiently used to improve web search, for web resources classification, for generating ontologies, for enhanced browsing etc. (Gupta M. et al, 2011).
Regarding our Events’ Repository service, it became clear from its very first uses that social tags should be an extra element in the metadata schema used during the events’ deposit procedure. This was mainly because it turned out that events’ creators, as well as their fans and followers, especially those of younger ages, preferred tags that they created themselves as primary metadata elements, both to describe and to search for events. As a result, we provided our service’s users with an extra way of classifying and searching for events (via the dynamically generated tag cloud), adding a folksonomy feature (Wikipedia, 2021) on top of the standard taxonomy, which is based on the metadata schema we mentioned previously.
The tags are proposed by users and approved for use by the curators who are responsible for the final content of our Events’ Repository. This makes the folksonomy “abnormal” in a sense, as the tags ultimately build up a controlled vocabulary.
Subsequently, a tag cloud for browsing the events is given as a feature in the spatiotemporal events’ map. The cloud follows the representation shown in Figure 2.
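The encoding used in Figure 2 (size for frequency of occurrence, colour for time of occurrence) can be sketched as follows. The function names, the input format and, in particular, the rule for a tag that occurs on events with different timing (present takes precedence over future, which takes precedence over past) are assumptions made for this example; the text does not specify how such cases are resolved.

```python
from collections import Counter
from datetime import datetime

# Colour coding as in the caption of Figure 2.
COLORS = {"future": "purple", "present": "green", "past": "grey"}
# Assumed precedence when a tag is attached to events with different timing.
PRECEDENCE = ("present", "future", "past")


def timing(start: datetime, end: datetime, now: datetime) -> str:
    """Classify an event as future, present or past relative to `now`."""
    if now < start:
        return "future"
    if now > end:
        return "past"
    return "present"


def tag_cloud(events, now):
    """Map each tag to (frequency, colour).

    `events` is an iterable of (tags, start, end) tuples.
    """
    counts = Counter()
    statuses = {}
    for tags, start, end in events:
        status = timing(start, end, now)
        for tag in tags:
            counts[tag] += 1
            statuses.setdefault(tag, set()).add(status)
    cloud = {}
    for tag, frequency in counts.items():
        status = next(s for s in PRECEDENCE if s in statuses[tag])
        cloud[tag] = (frequency, COLORS[status])
    return cloud


# Illustrative data only.
now = datetime(2022, 6, 15, 12, 0)
events = [
    (["jazz", "festival"], datetime(2022, 6, 14), datetime(2022, 6, 16)),
    (["jazz"], datetime(2022, 7, 1), datetime(2022, 7, 2)),
    (["theater"], datetime(2022, 1, 1), datetime(2022, 1, 2)),
]
cloud = tag_cloud(events, now)
print(cloud)
```

A rendering layer would then map the frequency to a font size; that step is omitted here because it depends on the presentation framework in use.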
Before closing this subsection, we have to mention that the update process for all controlled vocabularies that are used in giving values to metadata elements is the same as for the has (social) tags element.
That is, during the deposit procedure, if users cannot find an appropriate value for a certain element in the provided controlled vocabulary, they enter the value they consider most suitable, and that value is marked with a “For approval” label. When a curator enters the repository, all values that are awaiting approval are listed, and he/she proceeds with their approval, either as they are or in the form he/she suggests they should take.
After approval, the descriptions of events that had elements in the above condition are considered complete, and the events become searchable and are shown in the Events’ map service. Of course, each new value in the controlled vocabularies, once approved, becomes available for use by all subsequent users of the Events’ repository service.
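The update process described above can be summarised in a minimal sketch. The class `ControlledVocabulary`, its methods and the helper `is_complete` are hypothetical names introduced for this example, and the in-memory sets stand in for whatever storage the repository actually uses.

```python
from typing import Optional


class ControlledVocabulary:
    """Hypothetical centrally curated vocabulary with a 'For approval' queue."""

    def __init__(self, approved):
        self.approved = set(approved)
        self.pending = set()

    def submit(self, value: str) -> str:
        """Called during deposit; returns the label attached to the value."""
        if value in self.approved:
            return "Approved"
        self.pending.add(value)
        return "For approval"

    def approve(self, value: str, corrected: Optional[str] = None) -> str:
        """Curator step: accept the value as is, or in a corrected form."""
        self.pending.remove(value)
        final = corrected if corrected is not None else value
        self.approved.add(final)
        return final


def is_complete(values, vocabulary: ControlledVocabulary) -> bool:
    """An event description is complete once all its values are approved."""
    return all(value in vocabulary.approved for value in values)


# Illustrative walk-through with made-up category terms.
categories = ControlledVocabulary({"music", "theater"})
label = categories.submit("stand up")          # not yet in the vocabulary
before = is_complete(["music", "stand up"], categories)
final = categories.approve("stand up", corrected="stand-up comedy")
after = is_complete(["music", final], categories)
print(label, before, final, after)
```

In this sketch the caller replaces the submitted value with the term returned by `approve`, which mirrors the case where the curator approves a value in a modified form.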
Evaluation and Conclusions
So far, we have outlined the question of “the most effective and easy to use method when describing events in a repository”, which we faced during the implementation of our project.
In order to reach a conclusion, we conducted a survey among potential users through a web questionnaire with dependent questions and answers in the form of predefined values on a Likert scale (Fowler Jr, F. J., 2013).
In order to clarify what we mean by dependent questions we give the following example:
If, in the question “What sort of user do you consider yourself?”, a participant answers “Event’s Creator” (the other two values were “Event’s Attendant” and “Event’s Scholar”), then the question “Do you consider the events’ description process tedious and demanding?” appears next.
Our questionnaire was built using the EUSurvey service (European Commission, 2022), and participants were invited via a special feature embedded in the service.
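The dependency between questions in the example above can be expressed as a simple visibility rule. The sketch below is independent of EUSurvey and does not use its configuration format; the question identifiers, the `show_if` field and the five Likert labels are assumptions made for illustration.

```python
# Assumed five-point Likert labels; the actual scale used is not given in the text.
LIKERT = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]

# Each question may depend on the answer to an earlier one via `show_if`.
QUESTIONS = [
    {
        "id": "user_type",
        "text": "What sort of user do you consider yourself?",
        "options": ["Event's Attendant", "Event's Creator", "Event's Scholar"],
        "show_if": None,  # always shown
    },
    {
        "id": "deposit_tedious",
        "text": "Do you consider the events' description process tedious and demanding?",
        "options": LIKERT,
        "show_if": ("user_type", "Event's Creator"),
    },
]


def visible_questions(answers: dict) -> list:
    """Return the ids of the questions to display given the answers so far."""
    visible = []
    for question in QUESTIONS:
        condition = question["show_if"]
        if condition is None or answers.get(condition[0]) == condition[1]:
            visible.append(question["id"])
    return visible


print(visible_questions({"user_type": "Event's Creator"}))
```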
In less than ten days we had collected 197 contributions, which were analysed with the tools that the EUSurvey service provides, while for the composite quantitative variables that we produced, the Cronbach’s alpha reliability coefficient was computed (Gravetter F.J. & Wallnau L.B. 2017). From the analyses the following conclusions were derived:
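For reference, Cronbach's alpha for a composite variable built from k items is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the total score). The sketch below computes it with the Python standard library; the function name and the small data sets are illustrative assumptions and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of k item scores per participant."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Illustrative Likert-style answers (1-5), not taken from the survey.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
mixed = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha_consistent = cronbach_alpha(consistent)
alpha_mixed = cronbach_alpha(mixed)
print(alpha_consistent, alpha_mixed)
```

Items that move together perfectly give an alpha of 1, while less consistent items give lower values, which is why the coefficient is used as a reliability check for composite variables.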
- Formal ontologies are mostly welcomed by scholars who want to analyse the contents of the repository over the long term, in order to derive information that will aid them in their historical or sociological studies. They are also well accepted and effectively used by older events’ attendants and location visitors, who perform faceted or scenario-based searches to locate events of interest.
- For events’ creators, the use of formal ontologies to describe the events they submitted seemed rather complicated and tedious when all the ontology elements that were defined as non-empty were treated as mandatory. In that case, they stated that they would prefer these elements to be optional. They changed their minds, however, once they realised that events published with such elements left empty were quite likely to be missing from the results of related searches. This was not the case when the events were published by events’ promoters, the vast majority of whom considered the use of the metadata schema with mandatory elements a must.
- Tags were something that both events’ creators and younger events’ attendants considered a desirable element for the events’ descriptions and searches. Scholars considered them useful in studying the means and language constructs by which people identify and refer to objects in certain periods.
- Last but not least, detailed descriptions using the metadata schema, in which social tags are also a basic element, were considered the best choice by most of the users in all three categories, a conclusion that guided the final implementation of the events’ description procedure in our repository service.
References
Helic, D., Trattner, C., Strohmaier, M., & Andrews, K. (2011). Are Tag Clouds Useful for Navigation? A Network-Theoretic Analysis. International Journal of Social Computing and Cyber-Physical Systems, 1(1), 33-55. https://doi.org/10.1504/IJSCCPS.2011.043603
Gilliland, A. J. (2016). “Setting the Stage.” In Introduction to Metadata, edited by Murtha Baca. 3rd ed. Los Angeles: Getty Publications. http://www.getty.edu/publications/intrometadata/setting-the-stage/
Schema.org (2022). Event: A Schema.org Type. https://schema.org/Event
Dublin Core, Metadata Innovation (2020). DCMI Metadata Terms. https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
van Hage, W., Malaisé, V., Segers, R., Hollink, L., & Schreiber, G. (2009). The Simple Event Model Ontology. https://semanticweb.cs.vu.nl/2009/11/sem/
Library of Congress (2022). Library of Congress Names. http://id.loc.gov/authorities/names
Zhang, A., & Gourley, D. (2014). Creating digital collections: a practical guide. Elsevier.
Gupta, M., Li, R., Yin, Z., & Han, J. (2011). “An Overview of Social Tagging and Applications.” In Social Network Data Analytics, edited by C. Aggarwal. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-8462-3_16
Wikipedia (2021). Folksonomy. https://en.wikipedia.org/wiki/Folksonomy
Fowler Jr, F. J. (2013). Survey research methods. Sage publications.
Gravetter, F. J., & Wallnau, L. B. (2017). Statistics for the behavioral sciences. 10th ed. Boston, MA: Cengage.
European Commission (2022). EUSurvey. https://ec.europa.eu/eusurvey/home/welcome
[1] Single RTDI State Aid Action “RESEARCH – CREATE – INNOVATE”, with the co-financing of Greece and the European Union in the context of the Operational Program “Competitiveness, Entrepreneurship and Innovation (EPAnEK)” of the NSRF 2014-2020.
[2] Currently accessed from http://ncrawl.nmlabs.gr/ui/index.action
[3] Currently accessed from http://ncrawl.nmlabs.gr/ui/login
[4] Accessed from https://schema.org/
[5] Accessed from https://schema.org/Event