Tyler T. Procko

Resources

Under construction...


This page contains Semantic Web and Linked Data resources useful for the researcher. In my experience in this domain, learning its constructs is not trivial. Despite the pretense of widespread interconnection, the Semantic Web community is fractured and confused, and its best resources are buried in large technical documents scattered across the Web, with broken links abounding. Moreover, its best practitioners are difficult to pinpoint. On this page, I intend to provide as much of my own insight as possible in this regard. If I can save one researcher even a minute, I am happy to do so. I am aware how daunting research in this domain is.

People


Michael K. Bergman

Co-founder of Cognonto and the founder of the KBPedia project; writer of many Semantic Web articles and books; his site is probably the best domain resource there is

Henriette Harmse

European Bioinformatics Institute; hosts a good Semantic Web blog that is still relevant, and she responds to questions as of 2023; experienced in DLs

Barry Smith

Philosopher turned OWL ontologist; primary creator of the Basic Formal Ontology (BFO) and the OBO Foundry

Ontology Repos


OBO Foundry

Open Biological and Biomedical Ontology Foundry; home of BFO and all BFO-compliant ontologies; mostly biomedical in nature

Ontobee

The default server for most OBO ontologies; hosts a lot of ontologies not found on the OBO Foundry

AberOWL

Semantic search engine for ontology-based access to biological data

Ontology Lookup Service (OLS)

EMBL-EBI's repository and search service for biomedical ontologies

BioPortal

Self-described "world's most comprehensive repository of biomedical ontologies"

Ontohub

A central aggregate hub for ontology repositories (like BioPortal) and their ontologies

Linked Open Vocabularies (LOV)

The most popular hub for Linked Data and Linked Data vocabularies; no focus on ontologies and DLs like the biology repos

DBpedia Archivo

DBpedia's automatic Web crawler service that finds OWL ontologies on the Web and rates them according to its own 5-star rating scheme; the "back-end" service of DBpedia's Databus

Industrial Ontologies Foundry (IOF)

A project of Barry Smith's and the OBO Foundry; a BFO-hub-and-spokes ontology hub for industrial ontologies, as opposed to the biomedical focus of the OBO Foundry

Industry Portal

Ontologies in various industry domains

Upper Ontologies


What is an upper ontology?

An upper ontology (also called a top-level or foundational ontology) is a high-level, domain-independent ontology that provides basic categories and relationships applicable across all domains. These ontologies serve as a foundation for more specific domain ontologies, ensuring interoperability and consistency across different knowledge domains.

Basic Formal Ontology (BFO)

Developed by Barry Smith and the IFOMIS group; the most widely used upper ontology in biomedical and scientific domains; serves as the top-level ontology for the OBO Foundry; distinguishes between continuants (entities that persist through time) and occurrents (processes and events)

DOLCE (Descriptive Ontology for Linguistic and Cognitive Engineering)

Developed by the Laboratory for Applied Ontology (LOA) in Italy; focuses on cognitive and linguistic aspects; emphasizes the notion of ontological dependence; part of the WonderWeb project

SUMO (Suggested Upper Merged Ontology)

One of the largest and most comprehensive upper ontologies; includes over 20,000 terms and 80,000 axioms (counting its domain ontologies); freely available and mapped to WordNet; originally developed as a candidate for the IEEE Standard Upper Ontology Working Group

UMBEL (Upper Mapping and Binding Exchange Layer)

Developed by Michael K. Bergman and Structured Dynamics; designed to integrate with OpenCyc and serve as a vocabulary for Linked Data; focuses on practical mappings between various knowledge bases

Cyc and OpenCyc

One of the oldest and most ambitious AI projects; contains millions of assertions about common-sense knowledge; OpenCyc was the open-source release, which has since been discontinued; ResearchCyc is a separate release offered under a research license

Common Core Ontologies (CCO)

A BFO-conformant suite of mid-level ontologies; developed by CUBRC with support from various government agencies; focuses on information entities, agents, events, and artifacts; used extensively in defense and intelligence domains

GFO (General Formal Ontology)

Developed at the University of Leipzig; integrates notions of processes, objects, and time; distinguishes between different ontological levels; particularly strong in temporal modeling

YAMATO (Yet Another More Advanced Top-level Ontology)

Developed by Riichiro Mizoguchi's group in Japan; focuses on role concepts and context-dependent relationships; particularly useful for engineering and design domains

Coterie


What is a coterie?

A coterie is a small group of people with similar interests. Given the confusing nature of the Semantic Web, I have collected some related "resource dumps", along with companies and research groups working in the area. These are given here.

Resource Collections

Awesome Semantic Web

The most exhaustive and up-to-date list of Semantic Web resources, covering everything from standards to code libraries to companies; in the form of an "awesome list" in a GitHub repo

Companies and Consulting Firms

Semantic Arts

Leading semantic technology consulting firm founded by Dave McComb; specializes in enterprise data architecture, semantic modeling, and knowledge graph implementation; known for developing the Data-Centric Revolution methodology and semantic standards for enterprise integration

TopQuadrant

Founded by Holger Knublauch and Ralph Hodgson; developers of TopBraid suite of semantic tools; pioneers in SHACL development and adoption; provide enterprise semantic solutions and data governance platforms

Ontotext

Bulgaria-based company specializing in semantic technology and knowledge graph solutions; developers of GraphDB, a leading RDF database; provide text analytics and knowledge discovery platforms

Cambridge Semantics

Enterprise knowledge graph platform provider; developers of Anzo, a semantic data integration and analytics platform; focus on data fabric and enterprise data management solutions

Metaphacts

German company providing knowledge graph platforms and semantic applications; developers of metaphactory, a low-code platform for building knowledge graph applications; strong focus on enterprise semantic solutions

Academic and Research Groups

AKSW (Agile Knowledge Engineering and Semantic Web)

Research group at Leipzig University led by researchers including Jens Lehmann; focus on knowledge graphs, ontology learning, and semantic web applications; developers of numerous open-source semantic tools

Request For Comments (RFC) Series


What is an RFC Series?

RFC stands for Request for Comments. The first RFC Series came about with the advent of the Internet as a result of the ARPANET project. It is, more or less, a series of memos, notes and technical documents intended to document the history of Internet projects. I would use the term "blog", but its connotations for me are unprofessional.

Semantic Web


Why did the Semantic Web fail?

Tim Berners-Lee has a new project for personal data governance called the Solid Project. In the FAQs page of that project (archive here), the Solid team mentions that Linked Data and the Semantic Web 'never took off'. I have researched in this domain for some time now. Through my research, and discussions with other expert practitioners (e.g., Nicholas del Rio, Jarno van Driel), it is apparent that the failure of the Semantic Web as an ideal is due, in part, to the fact that adopting the W3C's standards was not clear or straightforward for data providers on the Web; it simply was not easy to do. The working groups pushing for the Semantic Web were primarily composed of academic types, not business people or developers, and so the more complex constructs, e.g., OWL, were never realized to any great extent on the Web. Also, the advent of social media sites replaced the vision of Web 3.0 as a Web of interlinked concepts with one of interactive websites. That being said, Linked Data is commonly used by search engines, e.g., with the Schema.org vocabulary, which is embedded in web pages as Microdata, RDFa or JSON-LD markup for more accurate searching, recommendations, etc.
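For a sense of what the Schema.org usage mentioned above looks like in practice, here is a minimal sketch that builds a JSON-LD description of a person and wraps it in the script tag a web page would carry. The person, job title and organization are made-up placeholders, and JSON-LD is only one of the syntaxes Schema.org can be written in.

```python
import json

# Hypothetical example data; none of these values come from a real page.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Doe",
    "jobTitle": "Researcher",
    "affiliation": {"@type": "Organization", "name": "Example University"},
}

# Serialize the description as JSON-LD text.
json_ld = json.dumps(person, indent=2)

# Pages usually embed JSON-LD inside a script element of this type.
script_tag = f'<script type="application/ld+json">\n{json_ld}\n</script>'

print(script_tag)
```

Compared with authoring an OWL ontology, this is about as much effort as most data providers are willing to spend, which may be part of why this slice of Linked Data caught on.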

OWL and DLs


What is OWL good for?

OWL is a manifestation of Description Logics (DLs); there are various DL profiles for OWL. For most use cases, OWL is considerable overkill. Inasmuch as OWL is a DL implementation, it abides by the open-world assumption (OWA), which is extremely difficult for most people to reconcile with anything else they do, e.g., OOP, which is closed-world by nature. For example, in Java, we may define a class Human but never give it an attribute for a brain. In closed-world logic, it is inferred that Humans do not possess brains. For an OWL Human class, unless it is explicitly stated with an OWL constraint that Humans do NOT have brains, it cannot be inferred that they do not; whether they do is simply unknown. In other words, OWL never treats a missing assertion as a negative one. This makes understanding its inferences on larger ontologies difficult; even veteran OWL modelers struggle to explain inferences. OWL is a very heavy-handed, complicated modeling language; and, unlike RDF / RDFS, which are rather simple, easily extended and only offer limited reasoning (e.g., subsumption inferencing), OWL is very difficult to extend, because, with each new OWL constraint added, the complexity of the OWA makes erroneous inferences exceptionally hard to diagnose. In my experience, OWL is only used to its full extent in the biomedical domain. Most people who work with an "OWL ontology" do not even come close to fully utilizing the DL underlying OWL: for the most part, they implement a class/relationship hierarchy (this can be done with RDFS), add annotation properties and perhaps a few necessary conditions (subClassOf constraints), and then they never run a reasoner. This is not an OWL ontology. In any case, for most use cases in business, OWL is far too complicated.
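The Human/brain example can be sketched in a few lines of plain Python. This is only a toy illustration of the two assumptions, not how an OWL reasoner works; the fact tuples and the function names closed_world and open_world are invented for the sketch.

```python
# Toy knowledge base of (subject, predicate, object) facts.
# All names here are illustrative assumptions, not a real ontology.
FACTS = {("Human", "hasPart", "Heart")}

def closed_world(fact, facts):
    """Closed-world assumption: anything not stated is taken to be false."""
    return fact in facts

def open_world(fact, facts, negated_facts):
    """Open-world assumption: anything not stated is unknown (None), not false.

    A fact is only false if its negation has been explicitly asserted.
    """
    if fact in facts:
        return True
    if fact in negated_facts:
        return False
    return None

query = ("Human", "hasPart", "Brain")

# Nothing was ever said about brains.
cwa_answer = closed_world(query, FACTS)               # treated as false
owa_answer = open_world(query, FACTS, set())          # unknown
owa_negated = open_world(query, FACTS, {query})       # false, because stated

print(cwa_answer, owa_answer, owa_negated)
```

The closed-world check collapses "not stated" into "false", while the open-world check keeps a third outcome for "unknown" and only answers "false" once the negation is asserted.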
Furthermore, OWL cannot be used to validate instance data against the ontology, unless one writes very specific SPARQL queries to do so; and, even then, "bad" instance triples cannot be rejected without a specific codebase written for this purpose. A language like the Shapes Constraint Language (SHACL) is a much more appropriate means of ontological modeling for business use cases, because it is a closed-world language and most major RDF graph database platforms support SHACL validation. SHACL can be used to define an ontology by shapes, which are then applied automatically on data ingestion to validate new triples. Inveterate OWL modelers like the founder of TopQuadrant have abandoned OWL in favor of SHACL. So, I point everyone I can to SHACL as an alternative to OWL.
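To make the idea of closed-world validation on ingestion concrete, here is a rough sketch in plain Python. It is not SHACL syntax and does not use any SHACL library; PERSON_SHAPE, validate and ingest are hypothetical names, and the "shape" is just a dictionary of required properties and expected value types.

```python
# A hypothetical "shape": nodes of the target class must carry these
# properties, each with a value of the given Python type.
PERSON_SHAPE = {
    "target_class": "Person",
    "required": {"name": str, "birthYear": int},
}

def validate(node, shape):
    """Return a list of violation messages; an empty list means the node conforms."""
    violations = []
    if node.get("type") != shape["target_class"]:
        return violations  # the shape does not target this node
    for prop, expected_type in shape["required"].items():
        if prop not in node:
            # Closed-world reading: a missing value is an error, not an unknown.
            violations.append(f"missing required property '{prop}'")
        elif not isinstance(node[prop], expected_type):
            violations.append(f"'{prop}' must be {expected_type.__name__}")
    return violations

def ingest(store, node, shape):
    """Append the node to the store only if it conforms; return any violations."""
    violations = validate(node, shape)
    if not violations:
        store.append(node)
    return violations

store = []
good = {"type": "Person", "name": "Ada", "birthYear": 1815}
bad = {"type": "Person", "name": "Bob"}

good_result = ingest(store, good, PERSON_SHAPE)
bad_result = ingest(store, bad, PERSON_SHAPE)

print(good_result, bad_result, len(store))
```

The point of the sketch is the rejection step: the node without a birthYear never reaches the store, whereas an OWL reasoner given a comparable restriction would typically assume an unstated value exists rather than flag the data as invalid.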

AI


How do you reconcile AI/ML with Semantic Web constructs like ontologies?

This will remain as a rather informal response. In the historical sense, ontologies and ML went hand-in-hand: ontologies were fed into ML, and vice versa. For instance, IBM's Watson touted ontology-based ML. In any case, as it stands now, in 2023 and beyond, ontologies are nearly irrelevant in practical use and the research landscape. A Google search for 'Python ML' will return thousands of posts from the same week; a Google search for 'Python ontology' will return posts from over a decade ago. Ontologies are complicated artifacts espoused by very few, while ML consistently explodes in popularity and use. Not many individuals can put forth the effort to abandon the closed-world logic driving literally everything we do as humans to fully comprehend the open-world logic of description logics, which ontologies (at least, OWL ontologies) subscribe to. But, collecting large amounts of data, training a model, getting a resultant vector matrix and using it? That is more approachable. And the research landscape is hot, with new articles being published literally by the minute. There simply is not the support for ontology work. Steadfast ontologists like Barry Smith are probably injured (understandably) by the exploding popularity of undocumented, unexplainable ML. Veterans of the Semantic Web space like Jens Lehmann have, in a sense, tipped the proverbial hat to the rise of ML and general AI. Whatever the thinking, ontologies still have some purpose. For instance, the heart of an ontology, its taxonomy, is essential in everything: models need to be classified to be found and used, as does training data. But for ontologists, true ontology work seems to be a concern only for biomedical clients. It's a bad game to be in.