Introduction
The maritime industry crosses all disciplines. There is more to it than shipping, just as there is more to shipping than the carriage of goods by sea. Besides logistics and transportation, the sector's knowledge touches upon trade, law, government, oceanography, marine biology, aquaculture, tourism, energy, climatology, education, technology, security, safety, navigation, international relations, economics, finance, insurance, engineering, Greek history and Norse literature, not to mention ships and seafaring.
Librarians supporting maritime research cannot hope to purchase all the commercial information packages needed to satisfy the breadth of research questions they encounter. Fortunately, what is needed more than free content is free access to a maritime reference source that indexes all relevant content, and only relevant content, whether that content has to be paid for after it has been discovered or is accessible on the free web.
This index does not exist. Even if it were theoretically possible to complete it, it would be out of date the instant new information became available, which is to say, immediately. The vastness of the existing content that has not been marked, and the relentless torrent of new content, necessitate the assistance, and persistence, of the crowd. The "crowd-sourced" strategy we have in mind approaches the challenge on two fronts:
On one front, librarians in the maritime community will use the APIs of publication "aggregators" to extract metadata, from which can be derived relevant title, author, and topical lists to share among the maritime research community. It should be in the interest of these commercial information vendors to support rather than oppose our mining of data from their systems, even if the resulting index competes with the search interfaces that they offer on their own websites. After all, a community-maintained index that is constantly improving in comprehensiveness and relevance has the potential to bring them more pay-per-view sales of the actual content being sought, regardless of where it was discovered.
Meanwhile, on the second front, maritime professionals can assist the professional librarian by bookmarking useful content that is typically not contained in the aggregations of the big players in academic publishing. With the click of the Delicious button on their toolbar, they can add to a common knowledge base, directing librarians to single-title web publications covering a special topic, or to reports and news from organizations with an interest in maritime affairs. The only criterion is that bookmarks be potentially relevant to a topic of maritime research.
We view this as the start, the seed, if you will, of something that has the potential to grow much larger. The collection of bookmarks will be a useful reference tool in itself, but the content they point to can be funneled through more sophisticated text-extraction and analysis tools. The extracted information can be added to the common index after it has been "bibliographized" in something FRBR-esque or Dublin Core-ish, which is library-speak for highly organized schemes to store and retrieve data in research systems. Social bookmarking sites, even those as feature-rich as Delicious, are not yet optimized for author headings or publication details, but they are excellent places to glean sources of useful information.
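To make "Dublin Core-ish" a little more concrete, here is a minimal Python sketch of how a shared bookmark might be turned into a record using a few Dublin Core element names (identifier, title, subject, description). The Bookmark class, its fields, and the to_dublin_core function are hypothetical names chosen for illustration; they are not part of any bookmarking site's API, and a real workflow would need author and publication details that a bare bookmark does not carry.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    """A shared bookmark as a contributor might save it (hypothetical shape)."""
    url: str
    title: str
    description: str = ""
    tags: list[str] = field(default_factory=list)


def to_dublin_core(bookmark: Bookmark) -> dict[str, list[str]]:
    """Map a bookmark onto a few Dublin Core elements.

    Dublin Core elements are repeatable, so every value is a list.
    Elements a bookmark cannot supply (creator, date, publisher) are
    left out rather than guessed.
    """
    record = {
        "identifier": [bookmark.url],
        "title": [bookmark.title.strip()],
        # Drop blank and duplicate tags, then sort for a stable order.
        "subject": sorted({tag.strip() for tag in bookmark.tags if tag.strip()}),
    }
    if bookmark.description.strip():
        record["description"] = [bookmark.description.strip()]
    return record


# Illustrative data only; the URL is a placeholder.
example = Bookmark(
    url="https://example.org/piracy-report",
    title=" Annual Piracy Report ",
    tags=["Piracy", "Hostages", "Piracy"],
)
record = to_dublin_core(example)
```

The record deliberately omits creator and date: those are the fields a librarian would have to add later, after the bookmarked page itself has been examined.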
In addition to contributing relevant bookmarks, donors (of their time) can be of even greater service by adding "tags" to their bookmarks, preferably using the terminology common in their area of expertise. To this point, we have only been bookmarking at the level of whole sites, adding tags and inserting brief descriptions of their use. If the site being bookmarked has a statement of its purpose or mission, we use that for our description. In theory there is no reason that individual articles and reports could not be bookmarked in this way as well. In any event, by deploying such a straightforward, unobtrusive tool as the Delicious Bookmarks button on the screens of maritime practitioners, we believe the information collected in the Maritime Stack will yield a rich harvest to pass along to librarians in the service of maritime interests.
But first, we have to find and identify the relevant content.
Our purpose is to aggregate maritime content. As mentioned in the introduction, the librarian cannot accomplish this without the contribution of workers and researchers in the maritime sector. The Delicious bookmarking site referenced in the introduction is only one of many places where contributions can be made. Our appeal applies to other social bookmarking sites that contributors might prefer to use. We have set up equivalent bookmark sharing opportunities in Bibsonomy, CiteULike, and Connotea, but any other site that can offer a set of shareable bookmarks pertaining to maritime affairs is eligible for inclusion. The critical point is that librarians serving maritime interests have a means of gleaning data for the common maritime index, and that the method or methods for collecting practical information comport with the work and information flows of the maritime practitioners and researchers.
For ease of use, the service used should offer a toolbar button compatible with the user's preferred browser. When users come upon a site or an article of interest, they should be able to click the button on their toolbar, select the "stack" or group that is being shared for the purpose (Maritime Stack, Maritime Information, etc.), and add topical "tags" as appropriate.
A bookmark can, and usually does, have more than one topic. It is generally advisable to use multiple small terms in preference to tags that are overly elaborate. For instance, "Acts of piracy resulting in the taking of hostages" would be better broken up into smaller facets that, when combined, will help users "drill down" to the specific article, e.g., "Piracy" and "Hostages." Avoid hierarchical tagging such as "Newbuildings--Finance--Denmark." Instead, use "Newbuildings," "Ship finance," and "Denmark."
While simple one- or two-word tags are the rule, a longer or more complex term that is commonly known to the profession is to be preferred over multiple facets that have been artificially simplified. Use "Collisions at sea," not "Collisions" and "Sea"; or "P&I clubs," not "Marine insurance" and "Cooperative associations." ("Marine insurance" or "Marine insurers" will be perfectly applicable tags in many instances, but for works discussing specifically those mutual insurance associations that provide member coverage for broader, indeterminate risks and third-party liabilities, "P&I clubs" is preferred for its precision. In some bookmarks, tags for both P&I clubs and Marine insurance could be justified, but the point of this example is that the more precise term should not be avoided merely because it is not a simpler, generic one.)
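The tagging guidance above could be backed by a small checking tool. The Python sketch below is one possible reading of it, under stated assumptions: the list of established terms is a hypothetical starter set, and the three-word cut-off for an "overly elaborate" tag is an arbitrary choice, not a rule from this project.

```python
# Terms the profession already treats as a unit; a hypothetical starter list.
ESTABLISHED_TERMS = {
    "collisions at sea",
    "p&i clubs",
    "ship finance",
    "marine insurance",
}

MAX_WORDS = 3  # assumed cut-off for an "overly elaborate" tag


def review_tag(tag: str) -> list[str]:
    """Return advice for one proposed tag; an empty list means it looks fine."""
    advice = []
    cleaned = " ".join(tag.split())  # collapse stray whitespace
    if cleaned.lower() in ESTABLISHED_TERMS:
        return advice  # established compound terms are kept whole
    if "--" in cleaned:
        parts = [p.strip() for p in cleaned.split("--") if p.strip()]
        advice.append(
            "hierarchical tag; consider separate tags: " + ", ".join(parts)
        )
    elif len(cleaned.split()) > MAX_WORDS:
        advice.append("long tag; consider breaking it into smaller facets")
    return advice
```

For "Newbuildings--Finance--Denmark" the function suggests the three parts as separate tags. The split is purely mechanical: choosing "Ship finance" over a bare "Finance" still calls for the judgment of someone who knows the field.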
Thoughtfully tagged "social bookmarks" will be useful for the researcher in maritime affairs, but just as importantly, they offer supporting librarians a list of resources from which to derive and maintain a thesaurus of authoritative terminology, author headings, place names and cross-references. Tagging is not an exact science, and does not need to be, but the more consistent the terms used, the easier the task of building an aggregate index of the vast ocean of maritime data that is available on the web.
Most bookmarking sites will suggest tags that other users have made. When they fit, use them. Doing so will aggregate the bookmarked content of more than one contributor, which is the point of the project in the first place.
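As a rough illustration of why shared tags matter, the following Python sketch groups bookmarks from several contributors under a common tag. The tuple layout (contributor, URL, tags) and the function name group_by_tag are assumptions made for this example; real bookmarking sites export their data in their own formats.

```python
from collections import defaultdict


def group_by_tag(bookmarks):
    """Build a tag -> sorted list of URLs index.

    `bookmarks` is an iterable of (contributor, url, tags) tuples.
    Tags are compared case-insensitively with whitespace collapsed, so
    "Piracy" and "piracy " land under the same heading.
    """
    index = defaultdict(set)
    for _contributor, url, tags in bookmarks:
        for tag in tags:
            key = " ".join(tag.split()).casefold()
            if key:
                index[key].add(url)
    return {tag: sorted(urls) for tag, urls in index.items()}


# Illustrative data only; contributors and URLs are placeholders.
shared = [
    ("contributor_a", "https://example.org/report-1", ["Piracy", "Hostages"]),
    ("contributor_b", "https://example.org/report-2", ["piracy ", "Ship finance"]),
]
by_tag = group_by_tag(shared)
```

Here both contributors' bookmarks end up under "piracy" because they chose the same term. Had one of them written "Pirates" instead, the two reports would sit under separate headings, which is the fragmentation that reusing suggested tags avoids.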
To repeat, contributors have their choice of bookmarking sites. We have already implemented stacks or groups for the following: Delicious, Bibsonomy, CiteULike, and Connotea.
All of these bookmarking sites have their advantages and disadvantages, but the fact that they are easy to use, provide a toolbar button for the browser, and allow sharing and tagging recommends them for the project. In fact, Bibsonomy, CiteULike and Connotea are particularly commendable for supporting specific tagging fields for author names and other bibliographic information. However, what is more convenient for librarians might require too much effort on the part of the bookmark contributor, in which case, use Delicious. If there is another bookmarking site more suitable or more convenient for the maritime professional, please have your librarian or information officer get in touch with us. The important thing is that you, the maritime professional or maritime researcher, help us identify the content that needs to be harvested. We support just about anything that will make the job easier for you.
Role of the Maritime Librarian
The librarian's role in gathering information for the common index is twofold. The first part is to extract metadata pertaining to maritime affairs from the major bibliographic aggregators (ProQuest, EBSCO, Ingenta, Informa, Springer, Ebrary, OCLC, etc.). The second involves crawling the sites, pages and documents that have been "socially bookmarked" by contributors from the maritime profession.
The commercial aggregation sites have the advantage of ready-to-use metadata (data about the data). Search results can be restricted by topic, author, date, publisher, and so forth. The same criteria can be used to generate browsing lists. Cross-references can direct users to authoritative terminology, to the changes of title in serialized publications, to variations in personal, corporate and place names, etc. Citation information can be exported and inserted into footnotes, endnotes or bibliographies within word processing documents.
Despite their many advantages, however, the large commercial databases and indexes of academic content have at least two weaknesses when it comes to maritime research. The first is a lack of contextual specificity, for want of a better term. Large academic databases typically offer content packages for lease that are based on broad classes of knowledge, such as Business & Economics, Science and Engineering, Arts & Humanities, Law and Politics, and Social Sciences. They might even break these classes down into much smaller categories. But as already noted in the introduction, the breadth of maritime inquiry defies such easy categorization. A maritime package, if such a thing existed, would include journals and book titles from virtually all knowledge categories. Indeed, it would include single articles from journals and individual chapters from books that otherwise have nothing to do with maritime research.
In addition to burdening maritime libraries with paying for far more content than their patrons are likely to use, the big commercial academic resources of necessity bury the documents sought by maritime researchers under a mound of irrelevant content. Again, the problem arises from a lack of contextual specificity. A general database uses general subject terms and general classification schemes. A librarian responsible for the traditional MARC cataloging of maritime content knows all too well the frustrating imprecision of Library of Congress Subject Headings, or the vague guidance of the major classification systems, which has us shelving works on a wide variety of maritime topics under Transportation. This is not to say that these general thesauri and classification systems do not have their place, but there is a crying need to supplement them with vocabularies and categories familiar to practitioners and researchers in the maritime world.
This can be accomplished if the subset of metadata relevant to maritime research is harvested from such databases and combined in a new aggregation, a common maritime index. The politics of extracting this metadata from commercial resources must be kept separate from the politics of extracting the content itself. The Maritime Librarian will have abundant opportunities to take up the clarion call for Open Access against traditional publishers, whose profits have risen in inverse proportion to their diminishing role in making knowledge public. The effort and expenses borne by researchers and their institutions are being routinely surrendered to middlemen who add little beyond the prestige of their name, but who, in selling the work back to the research community, can name their price. But this is not the occasion to fight that fight. To conflate the issue of freely harvested metadata with the issue of freely accessible content risks needlessly antagonizing potential allies in the campaign to build a common maritime index. After all, an index that better serves the specific needs of a particular research community will point users to the content they seek, regardless of where it resides and what is required to access it. And so, as necessary, it will be the job of the librarian to assure the information vendor that competing search interfaces are merely additional paths to the content they are selling.
The second weakness of the big academic databases is that they overlook a sea of useful maritime content, partly because it is not considered sufficiently scholarly, but also because much of the content is published by institutions that are unaware of, or unable to take advantage of, the benefits of aggregation services. Consequently, their works might show up in internet search results (Google, Yahoo, Bing, etc.), but getting them to the surface often requires already knowing the document title, if not also the author and date. It might even require knowing an exact string of words within the text to distinguish it from hundreds or thousands of other works. Thus, when a researcher is trying to comb the entire internet for works on a particular topic, the discovery of documents without "bibliographic" metadata is a hit-or-miss proposition at best. With the unfortunate state of the data as it is today, it is challenging to get comprehensive results in searches and news feeds without sacrificing a high degree of relevance. It is not uncommon, for example, for the researcher to have to wade through articles about Peyton Manning's passing statistics, minutes of the Provincial Planning Committee of the Maritime Provinces, or the bunker delay in a professional golf tournament.
Identifying the subset of useful maritime data, from among all the available information on the web, increases relevance. Indexing that data with author, title, and publication information, and assigning subject terminology, increases the "discoverability" of individual works within that subset. Merging these indexes with the metadata extracted from the commercial academic aggregators increases comprehensiveness. The details of how to get from here to there need to be worked out in collaboration with other librarians and their systems developers, but we know enough of how this process will work to begin with the first steps. The maritime librarians will begin the metadata extraction from existing aggregators while the maritime professionals and maritime researchers begin identifying and tagging content on the open web. The librarians will develop and deploy tools to glean metadata from the "raw" web pages, PDFs, etc., and add it to the common maritime index.
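The merging step can be pictured with a short Python sketch. It assumes, purely for illustration, that records from both sources have already been reduced to dictionaries with an "identifier" (a URL or DOI) and optional "title", "creator", and "subject" fields; the function name merge_indexes and the rule that the first non-empty title or creator wins are choices made for this example, not settled project decisions.

```python
def merge_indexes(*sources):
    """Merge lists of metadata records into one index keyed by identifier.

    For a repeated identifier, the first non-empty title and creator are
    kept and subject terms from all sources are combined without duplicates.
    """
    merged = {}
    for records in sources:
        for rec in records:
            key = rec["identifier"].strip()
            entry = merged.setdefault(
                key, {"identifier": key, "title": "", "creator": "", "subject": []}
            )
            for name in ("title", "creator"):
                if not entry[name] and rec.get(name):
                    entry[name] = rec[name]
            for term in rec.get("subject", []):
                if term not in entry["subject"]:
                    entry["subject"].append(term)
    return merged


# Illustrative data only: one record as an aggregator might supply it,
# two as they might be gleaned from shared bookmarks.
from_aggregator = [
    {
        "identifier": "https://example.org/a",
        "title": "Port State Control",
        "creator": "Doe, J.",
        "subject": ["Port state control"],
    }
]
from_bookmarks = [
    {"identifier": "https://example.org/a ", "subject": ["Inspections", "Port state control"]},
    {"identifier": "https://example.org/b", "title": "Bunker Prices"},
]
index = merge_indexes(from_aggregator, from_bookmarks)
```

In this toy case the first work keeps the author and title supplied by the aggregator and gains the practitioner's tag "Inspections", while the second work, known only from a bookmark, enters the index with no author at all. Filling in such gaps is the kind of follow-up work left to the librarians.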
It is hoped that, over time, those who benefit from better tools of maritime research will prevail upon their colleagues and their counterparts on the publication side to register their products with the common maritime index. By that time the librarians should have self-help tools and an online workflow in place to ensure that quality metadata is added to the index at the time of publication. Publishers of maritime information should feel free to register with other aggregation services as their interests require. We are not in competition with them. However it gets accomplished, our goal is a comprehensive index of information relevant to maritime research.
Great idea! The whole business, research, publishing and educational maritime industry must come together and organise all the information, data, and research related to our industry and make it accessible to everyone interested in it.