Vocabulary Matching Tool - Help
The Vocabulary Matching Tool builds on the tool originally developed during the first phase of the ARIADNE Project. It is intended to aid in the creation of mappings from locally used terms/concepts to the Getty Art & Architecture Thesaurus (AAT). The aim of the mapping exercise is to identify subject mappings from source terms/concepts to AAT concepts that are likely to be useful to assist subsequent browsing and searching of the data. The creation of mappings to a common spine vocabulary will enable improved opportunities for multilingual subject access and cross search in the finished system, by aggregating mappings from multiple data partners. The application presents an editable table of currently derived matches with a direct AAT lookup facility to make more informed mapping decisions. The set of mappings created may be exported to JSON or delimited text (CSV) format for use in other applications.
This tool requires a fairly modern web browser to function correctly. It has been tested on Google Chrome Version 74.0.3729.131 (64-bit) and on Safari 12.1. Unfortunately it does not yet work on the Microsoft Internet Explorer or Edge browsers.
User Interface Language
The tool has been developed with a multilingual user interface. Selecting the required UI language from the drop-down list will change the language of the title, table headers, button labels and footer description. The data within the table, however, will not be changed. Please bear in mind that, due to time constraints, only a limited number of UI languages are currently available, and the initial translations were made using Google Translate, so there may be some odd mistranslations. We would be very grateful to receive feedback on any language issues identified, in order to improve the tool for yourselves and others.
The main table holds the current set of matches compiled; the row count in the bottom right corner of the table gives the count of currently expressed matches. The data is completely private to you - although the table data is automatically saved between browser sessions it is only saved locally to your current device/browser, and this data can easily be lost if the browser cache is cleared. It is therefore advised to export your work to an external (JSON) file frequently in order to properly back it up (or to transfer it between machines/browsers). You can do this simply by clicking the 'export JSON' button. Should the worst happen then you will have a recent copy of your data. If you want to use a different device/browser then export the current mappings to a JSON file, open this application on the new device/browser and import the previously saved file. Table cells edited during the current session (since the last data import or browser refresh) will appear temporarily highlighted so you can see what you have recently changed. If these highlights become distracting simply refresh the browser and these highlights will disappear.
A small grey triangle displayed within a column header indicates the table data can be sorted by the contents of this column; a black triangle indicates the current sort column. The current sort order is indicated by the orientation of the black triangle - upward pointing for ascending sort order; downward pointing for descending sort order. Clicking anywhere within the column header will toggle ascending / descending sorting on that column.
Column filters allow restricting the display to a subset of the overall matches. Typing a text value into the filter box will interactively restrict the displayed matches to records having column contents that contain the typed value. If a filter is active when a data export (JSON/CSV) is performed then only the filtered data will be present in the exported file - this is useful should you wish to export a smaller subset of the data for discussion/dissemination purposes.
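The filter behaviour described above can be pictured as a simple substring test on a single column. The following Python sketch is illustrative only: the field names (`source_label`, `target_label`), the function name and the case-insensitive comparison are assumptions, not the tool's actual implementation.

```python
def filter_rows(rows, column, text):
    """Keep rows whose value in `column` contains `text`.

    Case-insensitive comparison is an assumption made for this sketch.
    """
    needle = text.strip().lower()
    if not needle:
        return list(rows)  # an empty filter box restricts nothing
    return [row for row in rows if needle in str(row.get(column, "")).lower()]


# Hypothetical match records, for illustration only.
matches = [
    {"source_label": "coffee cup", "target_label": "cups"},
    {"source_label": "hand axe", "target_label": "axes"},
    {"source_label": "tea cup", "target_label": "cups"},
]

visible = filter_rows(matches, "source_label", "cup")
for row in visible:
    print(row["source_label"], "->", row["target_label"])
```

Under these assumptions, an export performed while the filter is active would contain only the rows held in `visible`.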
It is possible to copy data from the table to the clipboard for use in other applications. First select the table - it is recommended to just click to the right hand side of the target concept field of any existing row (as this won't activate any other window or drop down). Typing CTRL+C will copy the table data to the clipboard (or in the browser menu select Edit - Copy). Note this will copy all rows of the table, not just the rows currently scrolled into view. If you would like to limit the data copied you can use the column filters described above to reduce the display to a subset of records - then only these records will be copied. The data is copied to the clipboard in tab-delimited text format. You may notice when pasting the data into another application (e.g. a spreadsheet) that there are a few additional columns present; these are internal timestamps of record creation and editing plus an internal record identifier. You may keep or discard this data as required.
You can also paste data from the clipboard directly into the table by typing CTRL+V (or in the browser menu select Edit - Paste). This is particularly useful to kick-start your mapping exercise based on an external textual listing of identifiers/terms. Pasting requires at least one existing row to be present - so for an initially empty table just click the 'add new row' button, click the empty grey area beneath the added row to ensure the table is selected, then press CTRL+V. The pasting assumes the same column structure as the existing table - so if you only have a list of terms, the data copied to the clipboard would need either some form of identifiers or just a blank column, so that the pasted data goes into the correct place in the table. Note that pasted data will be appended to any existing data already in the table. It is advisable to export any existing table data to a JSON file before attempting this operation; if anything goes wrong you can simply re-import the JSON file and restore the original data.
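As an illustration of preparing a plain list of terms for pasting, the sketch below builds tab-delimited text with a blank identifier column in front of each term. It assumes that the identifier is the first table column and the term the second; the function name is hypothetical, and the actual column order should be checked against the table before pasting.

```python
def terms_to_clipboard_text(terms, identifiers=None):
    """Return tab-delimited lines: identifier column first, then the term.

    The column order (identifier, then term) is an assumption of this sketch.
    """
    lines = []
    for index, term in enumerate(terms):
        # Use a blank placeholder when no identifiers are supplied.
        identifier = identifiers[index] if identifiers else ""
        lines.append(f"{identifier}\t{term.strip()}")
    return "\n".join(lines)


text = terms_to_clipboard_text(["hand axe", "coffee cup"])
print(text)
```

The resulting text can be copied to the clipboard from any text editor or spreadsheet and then pasted into the table as described above.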
It is possible to use a combination of copying and pasting to perform external bulk editing operations on the table data if required. For example, setting the source language row by row may be cumbersome; you can copy the table data and paste it into a spreadsheet, fill in the appropriate values for the source language column, then copy and paste the spreadsheet data back into the table (remembering to clear the existing table first to avoid data duplication).
This section represents concepts/terms originating from the source vocabulary. We say concepts/terms as there may well be no formally structured original source vocabulary with concept identifiers, preferred terms etc. The tool can flexibly deal with the case where the source vocabulary consists of a flat list of terms. For consistency we refer to the source entity as a 'concept' even where we only have a term or phrase expressed.
The local identifier of the source term/concept. Edit the value directly in the cell of the table. You should use a URI if one exists - identifiers entered as a URI will then display as a clickable link allowing you to navigate directly to that online address. This is useful in the case of Linked Open Data source vocabularies, allowing you to examine the context of the source concept in more detail - e.g. scope note, hierarchical structure etc. If no such URI exists, enter a unique concept identifier, and as a last resort if no such identifier exists you could just use the source concept label as an identifier.
The source term itself, or the preferred label of an identified source concept where a more sophisticated structured source vocabulary exists (e.g. a thesaurus).
Optionally state the native language of the source term by picking from the options in the drop down list. This information will assist the provision of multilingual search capabilities later in the project.
The purpose of the matching exercise is to make the most appropriate match from each concept/term in the source vocabulary to a concept in the AAT. Usually you will just make one match (the best one) for any given source concept - there is usually no need to express multiple relationships to AAT concepts as this is provided gratis via the AAT’s semantic structure. Thus if you make a match from a given source concept to an AAT target concept then there is no need to also make mappings to narrower AAT concepts for that given source concept. The exception is where the source concept relates to two genuinely different AAT concepts. In this case additional mappings are possible. The type of match between the source concept and the target concept will be one of the possible SKOS mapping properties (as defined by the SKOS reference document), listed here in order of preference:
- Exact Match - this match type indicates that there is "a high degree of confidence that the concepts can be used interchangeably across a wide range of information retrieval applications".
- Close Match - can be used to link concepts "that are sufficiently similar that they can be used interchangeably in some information retrieval applications". It is a more approximate relationship between the source and target concepts.
- Broad Match - expresses a hierarchical generic relationship between concepts. If a source concept is more specific in scope than any available AAT concept then you can make a Broad Match link to the AAT concept. This is useful for cases when a source vocabulary contains more detailed concepts. In this case the "all/some" rule should hold true e.g. "coffee cups" Broad Match "cups" - ALL coffee cups are cups; SOME cups are coffee cups.
- Narrow Match - expresses a hierarchical generic relationship between concepts. It is not expected that you would need to make much use of Narrow Match relationships for ARIADNEplus vocabularies. The "some/all" rule should hold true e.g. "cups" Narrow Match "coffee cups" - SOME cups are coffee cups; ALL coffee cups are cups.
- Related Match - expresses an associative relationship between concepts. The exact nature of the relationship is not specified, only that there is some "see also" type of connection between them. E.g. bullets Related Match guns. Bullets can be associated with guns, though you can see that this relationship is different to the other relationship types listed above. Preferably you would aim to (wherever possible) create a more direct concept mapping (e.g. to bullets).
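To make the match types above more concrete, the following sketch pairs each one with the corresponding SKOS mapping property and formats a single mapping as an N-Triples statement. The property names come from the SKOS namespace; the two concept URIs are placeholders under example.org, and the function name is hypothetical. It is not a description of the tool's own export format.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Match types in the order of preference listed above.
MATCH_PROPERTIES = {
    "Exact Match": SKOS + "exactMatch",
    "Close Match": SKOS + "closeMatch",
    "Broad Match": SKOS + "broadMatch",
    "Narrow Match": SKOS + "narrowMatch",
    "Related Match": SKOS + "relatedMatch",
}


def mapping_triple(source_uri, match_type, target_uri):
    """Format one source-to-target mapping as an N-Triples statement."""
    prop = MATCH_PROPERTIES[match_type]
    return f"<{source_uri}> <{prop}> <{target_uri}> ."


triple = mapping_triple(
    "http://example.org/local/coffee-cups",  # hypothetical source concept
    "Broad Match",
    "http://example.org/target/cups",        # placeholder for an AAT concept URI
)
print(triple)
```

Read left to right, the statement says that the source concept ("coffee cups") has the target concept ("cups") as a broader match, which mirrors the "all/some" example given for Broad Match.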
The target vocabulary for subject mappings in the ARIADNEplus project is the Getty Art & Architecture Thesaurus (AAT). The displayed link is a concept from the Getty Vocabulary Program AAT Linked Open Data (LOD). Clicking on the link will take you to a page displaying all LOD properties for that concept. The displayed label is the preferred term of the target concept (in English). Bear in mind there may be many other (multilingual) labels attached to the concept.
This will display a search facility allowing you to search and display details of concepts present in the target AAT vocabulary. Typing into the search box and hitting the return key will display a list of matching concepts based on a preferred/alternate (multilingual) label match. By default ALL words in a phrase will be matched - e.g. searching for axe money will find concepts whose labels match both axe and money. If you want to avoid this behaviour you can perform an exact phrase match by enclosing the search string in double quotes, i.e. "axe money". You can also perform partial matches using a wildcard character (*), e.g. bullet* will find both bullets and bulletins.
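The three search behaviours described above (all words, exact phrase, wildcard) can be approximated with the simplified sketch below. It is a guess at the semantics for illustration only: the function name, the whole-word comparison, the case-insensitivity and the treatment of `*` as a trailing wildcard are all assumptions, and the real search service may behave differently.

```python
import re


def label_matches(query, label):
    """Return True if `label` satisfies `query` under the assumed rules."""
    query = query.strip()
    text = label.lower()
    # Exact phrase: the whole query is enclosed in double quotes.
    if len(query) >= 2 and query.startswith('"') and query.endswith('"'):
        return query[1:-1].lower() in text
    words = re.findall(r"\w+", text)
    # Default: every query term must match some word of the label.
    for term in query.lower().split():
        if term.endswith("*"):
            prefix = term[:-1]
            if not any(word.startswith(prefix) for word in words):
                return False
        elif term not in words:
            return False
    return True


print(label_matches("axe money", "money axe"))    # all words present, any order
print(label_matches('"axe money"', "money axe"))  # phrase must appear as typed
print(label_matches("bullet*", "bulletins"))      # wildcard matches a prefix
```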
The results initially display the preferred term of the concept (in English), plus a shortened hierarchical ancestry for the concept giving further context. Hovering the cursor over any result will display the scope note for the concept. Note the actual match may be via one of the alternate terms (in any available language) rather than the displayed English term. It is generally advisable when making a mapping to take account of the full details of the AAT concept (such as the scope note and hierarchical context) rather than only relying on the match of the label.
Selecting any one of the results will display further details of the concept. These details include:
- Preferred term (English only) - the term used as a readable label for the concept in the application.
- A persistent Linked Open Data (LOD) identifier for the currently displayed concept. This identifier (URI) can be used in datasets to uniquely reference AAT concepts. The URI also resolves to a page displaying various properties of the concept.
- Navigable links to broader, narrower or related concepts. The hierarchical and associative structure of the AAT can be interactively explored and traversed via these links.
- The scope note for the concept. The described scope of the concept helps to explain the circumstances when you would and would not use the concept, and occasionally suggests alternative concepts.
- The full hierarchical ancestry for the concept, illustrating its position in the overall polyhierarchical structure.
- Multilingual terms associated with the concept (grouped by language). You can search for concepts using any of these terms.
Note it is possible when interactively exploring and navigating around the AAT structure to select a guide term. These are identifiable in the application as their label is enclosed in angle brackets (e.g. <church buildings by location or context>). Guide terms are used for grouping purposes and should not be used in matches as they are not concepts, although the application will not currently prevent you from doing so.
Clicking the 'delete' button located at the far right of any row will delete that particular row from the table. You will be prompted with an 'are you sure' message - to prevent accidental deletions.
Import a set of matches from a previously saved external JSON file. This can be useful if you want to continue working on a set of matches you have previously saved, or sets of matches produced by other people.
Export the current set of partial or complete matches from the table to an external (JSON) file. This is useful to save your work, or to send your matches to other people. The exported data file will incorporate the current (local) timestamp into the filename using the format "vmt-YYYYMMDDhhmmss.json" (e.g. "vmt-20190507122533.json"). Clicking the export JSON button occasionally during your work will create a trail of separate versioned backups, so you can easily go back to an earlier version and re-import that if required.
Export the current set of matches to a delimited text (CSV) format file. This is useful for subsequent import to e.g. a spreadsheet application. The exported data file will incorporate the current (local) timestamp into the filename using the format "vmt-YYYYMMDDhhmmss.csv" (e.g. "vmt-20190507122533.csv").
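The timestamped file names used by the JSON and CSV exports follow the pattern "vmt-YYYYMMDDhhmmss" plus the file extension. The sketch below reproduces that pattern in Python, which may be handy for locating or sorting backup files in a script; the function name is hypothetical and this is not the tool's own code.

```python
from datetime import datetime


def export_filename(extension, when=None):
    """Build a name of the form vmt-YYYYMMDDhhmmss.<extension>."""
    moment = when or datetime.now()  # local time, as in the exported file names
    return f"vmt-{moment.strftime('%Y%m%d%H%M%S')}.{extension}"


name = export_filename("json", datetime(2019, 5, 7, 12, 25, 33))
print(name)  # vmt-20190507122533.json
```

Because the timestamp runs from year down to second, sorting the file names alphabetically also sorts the backups chronologically.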
Add New Row
Adds a new empty row to the current set of matches. You can then edit the cells of the row directly in the table. The new row will be inserted as the first row of the table, so if you are not sure where your new row has gone, simply scroll to the top of the table.
This will clear ALL current matches from the table. You may wish to save the current set of matches to an external (JSON) file prior to doing this, so they can be loaded back in on this or another machine at a later date to continue working, or communicated to other people. You will be prompted before the data actually disappears, to prevent accidental deletions.
Displays this help page in a separate browser tab.