DBpedia Explorer (WADe)

A modular Web tool (SPA) for exploring large knowledge collections through DBpedia, driven by SPARQL and enriched with RDFa/JSON-LD and an extension system.

Author: Teodor Socea
Course: Web Application Development (WADe)
Keywords: DBpedia, SPARQL, RDF, RDFa, JSON-LD, SKOS, Semantic Web, Single Page Application, Microservices

Repository / Deployment:

Abstract

This report describes DBpedia Explorer (WADe), a service-oriented Web application that provides interactive visual exploration of a large knowledge collection. The system is built as a Single Page Application (SPA) plus a FastAPI backend that translates user tasks into SPARQL queries executed against the public DBpedia endpoint. The client renders search, category navigation (broader/narrower), entity inspection, and multiple extensions (facets, category map, related entities). The UI is enriched with RDFa and JSON-LD, and each resource of interest is shareable via URL and QR code.

1. Introduction

Large knowledge collections such as DBpedia provide rich RDF data but require expertise to query and interpret. This project bridges that gap by offering a human-friendly SPA that exposes core exploration tasks while preserving machine-readable semantics.

The main goals are: (1) enable useful visual navigation over DBpedia categories and entities, (2) provide modular extensions demonstrating non-trivial operations, and (3) follow Web engineering practices (service-oriented architecture, documentation, reproducible deployment).

2. Requirements & Deliverables Mapping

WADe requirement | How the project satisfies it
Modular Web tool (SPA or extension) with visualizations | React SPA with multiple UI faces and 3+ extensions (facets, category map, related entities).
Knowledge model expressed in SKOS/OWL, accessed via SPARQL | DBpedia categories treated as SKOS-like concepts; broader/narrower navigation; SPARQL queries against DBpedia endpoint.
At least 3 extensions (filtering, layered visualization, trend/comparison, etc.) | Type facets (intelligent filtering), Category map (layered local graph), Related entities (comparison/similarity via shared categories).
Service-oriented architecture + design docs | SPA + FastAPI + DBpedia endpoint, documented in Section 4 (Falsehoods, UI Faces, C4, Design Docs).
Client representations include HTML5 + schema.org + RDFa | Pages include RDFa attributes and JSON-LD injection (schema.org) for category/entity pages.
Each resource exposed by URL and/or QR code | ShareBar component provides copyable URL + QR code for category/entity pages.
Fully implemented and deployed solution | Dockerized containers deployed on Google Compute Engine with Docker Compose.

3. Data Sources & Knowledge Model

3.1 External data source: DBpedia

The application uses the public DBpedia SPARQL endpoint as a read-only source of RDF data. Core interactions are implemented as SPARQL queries.
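As an illustration of read-only access, the following minimal sketch builds (but does not send) a SPARQL Protocol GET request using only the Python standard library. The function name build_sparql_request is hypothetical and the report does not state how the backend issues its HTTP calls; the endpoint URL is the public DBpedia endpoint.

```python
from urllib.parse import urlencode
from urllib.request import Request

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"  # public DBpedia SPARQL endpoint


def build_sparql_request(query: str, endpoint: str = DBPEDIA_ENDPOINT) -> Request:
    """Build a read-only SPARQL GET request (hypothetical helper, not the project's code)."""
    params = urlencode({"query": query})  # percent-encodes the query text
    return Request(
        f"{endpoint}?{params}",
        # Ask for the standard SPARQL JSON results format.
        headers={"Accept": "application/sparql-results+json"},
        method="GET",
    )


request = build_sparql_request("SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")
print(request.full_url)
```

Passing the returned object to urllib.request.urlopen would execute the query; a timeout is advisable because a public endpoint can be slow or unavailable.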

3.2 Knowledge model reuse (SKOS/OWL/vocabularies)

DBpedia categories are used as knowledge organization units. Category pages are presented as SKOS-like concepts, and relationships are exposed using RDFa: broader categories and narrower categories correspond to hierarchical navigation.

Note: The project intentionally reuses an existing public knowledge base (DBpedia) rather than creating a new taxonomy from scratch.

4. Architecture & Design (Falsehoods • UI Faces • C4 • Design Docs)

4.1 Falsehoods

Mitigations include defensive UI rendering, strict pagination, limited result sets, and tolerant handling of empty/partial data.

4.2 UI Faces

4.3 C4 Model

C1 — System Context

Users interact with the SPA in a browser. The SPA communicates with the backend API. The backend queries the DBpedia SPARQL endpoint.

Figure 1 — C4 (Level 1) System Context. End users interact with the SPA, which calls the backend API; the backend queries the external DBpedia SPARQL endpoint.

  • User (Web browser): uses DBpedia Explorer over HTTP(S).
  • DBpedia Explorer (WADe), SPA + API (service-oriented): provides search, category navigation, entity inspection, extensions, and semantic annotations (RDFa/JSON-LD).
  • DBpedia (public SPARQL endpoint): queried by DBpedia Explorer via SPARQL.

C2 — Container Diagram

Figure 2 — C4 (Level 2) Container Diagram. Main runtime containers and protocols.

  • React SPA (Vite + TypeScript): routes Search / Category / Entity; extension registry (slots); RDFa + JSON-LD + QR sharing.
  • FastAPI Backend (REST API): /api/search (pagination + totals); /api/category/* (broader/narrower); /api/entity/* (details + related); SPARQL generation + response shaping.
  • DBpedia SPARQL endpoint: RDF dataset; public service.
  • Protocols: SPA to backend over HTTP with JSON; backend to DBpedia over SPARQL with JSON results.

Figure 3 — Request/Data Flow (Search). End-to-end steps from user action to rendered results.

  1. User: submits a query (e.g., “physics”) and selects kind + page.
  2. SPA: sends GET /api/search?q=…&limit=…&offset=… and renders JSON results + pagination.
  3. API + DBpedia: the API builds SPARQL and queries DBpedia, then shapes the returned JSON bindings into compact JSON.

4.4 Design Docs

Extension system

The SPA supports a slot-based extension registry. Extensions declare an identifier, a target slot, and a render function. This enables modular functionality such as facets, category map, and related entities.
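The registry idea can be sketched as follows. The SPA itself is written in TypeScript; this sketch uses Python only for consistency with the other examples in this report, and the class names, slot names ("category.sidebar", "entity.sidebar"), and string-returning render functions are assumptions rather than the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Extension:
    id: str                        # unique identifier
    slot: str                      # target slot (slot names here are hypothetical)
    render: Callable[[dict], str]  # render function: page context -> markup


class ExtensionRegistry:
    """Groups extensions by slot so a page can render whatever is registered there."""

    def __init__(self) -> None:
        self._by_slot: Dict[str, List[Extension]] = {}
        self._ids: Set[str] = set()

    def register(self, ext: Extension) -> None:
        if ext.id in self._ids:
            raise ValueError(f"duplicate extension id: {ext.id}")
        self._ids.add(ext.id)
        self._by_slot.setdefault(ext.slot, []).append(ext)

    def render_slot(self, slot: str, context: dict) -> List[str]:
        # Extensions render in registration order; an unknown slot renders nothing.
        return [ext.render(context) for ext in self._by_slot.get(slot, [])]


registry = ExtensionRegistry()
registry.register(Extension("type-facets", "category.sidebar", lambda ctx: f"Facets for {ctx['label']}"))
registry.register(Extension("category-map", "category.sidebar", lambda ctx: f"Map for {ctx['label']}"))
registry.register(Extension("related-entities", "entity.sidebar", lambda ctx: f"Related to {ctx['label']}"))

print(registry.render_slot("category.sidebar", {"label": "Physics"}))
```

Because pages only ask the registry for a slot, a new extension can be added without modifying the page component that hosts it.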

Input/output data formats

Data/task flow example (Search)

  1. User enters query and submits.
  2. SPA sends GET /api/search with q/kind/limit/offset.
  3. Backend constructs SPARQL query and sends it to DBpedia.
  4. Backend transforms bindings into compact JSON results.
  5. SPA renders results and pagination.
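Step 4 of the flow above can be sketched as a small transformation from the standard SPARQL JSON results format to the compact result objects shown in Section 5.2. The function name shape_bindings, the variable names ?uri and ?label, and the rule that derives "kind" from the URI prefix are assumptions; the report does not specify how the backend distinguishes categories from entities.

```python
from typing import Any, Dict, List

CATEGORY_PREFIX = "http://dbpedia.org/resource/Category:"


def shape_bindings(sparql_json: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten SPARQL JSON bindings into compact {uri, label, kind} objects."""
    results = []
    for row in sparql_json.get("results", {}).get("bindings", []):
        uri = row.get("uri", {}).get("value")
        if not uri:
            continue  # tolerate partial rows instead of failing the whole request
        # Fall back to the last URI segment when no label is bound.
        label = row.get("label", {}).get("value") or uri.rsplit("/", 1)[-1]
        kind = "category" if uri.startswith(CATEGORY_PREFIX) else "entity"
        results.append({"uri": uri, "label": label, "kind": kind})
    return results


sample = {
    "head": {"vars": ["uri", "label"]},
    "results": {
        "bindings": [
            {
                "uri": {"type": "uri", "value": "http://dbpedia.org/resource/Category:Physics"},
                "label": {"type": "literal", "xml:lang": "en", "value": "Physics"},
            },
            {"uri": {"type": "uri", "value": "http://dbpedia.org/resource/Physics"}},
            {"label": {"type": "literal", "value": "row without a URI"}},
        ]
    },
}
compact = shape_bindings(sample)
print(compact)
```

Skipping incomplete rows matches the tolerant handling of empty or partial data described in Section 4.1.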

5. API Design (REST)

The backend exposes a REST API used by the SPA. Endpoints are stateless and return JSON.

5.1 Endpoint overview

Endpoint | Purpose | Notes
GET /health | Health check | Used for deployment verification
GET /api/search | Search entities/categories | Supports kind, limit, offset; returns totals
GET /api/category/{id} | Category details | Broader/narrower + entity count
GET /api/category/{id}/entities | Entities in category | Paginated list
GET /api/category/{id}/facets/types | Type facets | Top rdf:types (best-effort)
GET /api/entity/{id} | Entity details | Label, abstract, types, categories
GET /api/entity/{id}/related | Related entities | Similarity via shared categories (heuristic)

5.2 Input/output examples

Search request: /api/search?q=physics&kind=all&limit=20&offset=0

Search response (example):

{
  "query": "physics",
  "kind": "all",
  "limit": 20,
  "offset": 0,
  "count": 20,
  "total": 161,
  "entityTotal": 90,
  "categoryTotal": 71,
  "results": [
    { "uri": "http://dbpedia.org/resource/Category:Physics", "label": "Physics", "kind": "category" }
  ]
}
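The limit, offset, and total fields in the response are sufficient to drive pagination controls on the client. The following sketch shows one way to derive them; the helper name page_info and the returned field names are hypothetical and not taken from the SPA.

```python
import math
from typing import Dict, Optional


def page_info(limit: int, offset: int, total: int) -> Dict[str, Optional[int]]:
    """Derive pagination state from the limit/offset/total fields of a search response."""
    if limit <= 0 or offset < 0 or total < 0:
        raise ValueError("limit must be positive; offset and total must be non-negative")
    return {
        "page": offset // limit + 1,                # 1-based current page
        "pages": max(1, math.ceil(total / limit)),  # at least one (possibly empty) page
        "next_offset": offset + limit if offset + limit < total else None,
        "prev_offset": max(0, offset - limit) if offset > 0 else None,
    }


# Values taken from the example response above.
info = page_info(limit=20, offset=0, total=161)
print(info)
```

For the example response, 161 results at 20 per page give nine pages, and the next request would use offset=20.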

6. SPARQL Queries of Interest

The backend composes SPARQL queries to implement search, navigation, and extensions. Below are representative examples.

6.1 Search (entities/categories)

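PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dct:  <http://purl.org/dc/terms/>

# NOTE: pseudocode; the backend uses safe formatting + limits
# ?l is aggregated into ?label because SPARQL 1.1 does not allow
# "AS ?label" when ?label is already bound in the WHERE clause.
SELECT ?uri (SAMPLE(?l) AS ?label)
WHERE {
  # ... entity or category patterns ...
  ?uri rdfs:label ?l .
  FILTER(lang(?l) = "en")
  FILTER(CONTAINS(LCASE(STR(?l)), LCASE("physics")))
}
GROUP BY ?uri
LIMIT 20
OFFSET 0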
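The "safe formatting + limits" mentioned in the query comment could look like the sketch below: the user term is escaped before being placed in a SPARQL string literal, and limit/offset are clamped. The function names, the cap of 100, and the added ORDER BY (for stable pages) are assumptions, and the entity/category patterns elided in the query above are omitted here as well.

```python
MAX_LIMIT = 100  # assumed upper bound; the report does not state the real cap

# Characters that must be escaped inside a double-quoted SPARQL string literal.
_ESCAPES = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}


def escape_literal(text: str) -> str:
    """Escape text so it cannot terminate the surrounding string literal."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def build_search_query(term: str, limit: int = 20, offset: int = 0) -> str:
    limit = max(1, min(int(limit), MAX_LIMIT))  # clamp to 1..MAX_LIMIT
    offset = max(0, int(offset))                # never negative
    needle = escape_literal(term.strip().lower())
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?uri (SAMPLE(?l) AS ?label)\n"
        "WHERE {\n"
        "  ?uri rdfs:label ?l .\n"
        '  FILTER(lang(?l) = "en")\n'
        f'  FILTER(CONTAINS(LCASE(STR(?l)), "{needle}"))\n'
        "}\n"
        "GROUP BY ?uri\n"
        "ORDER BY ?uri\n"
        f"LIMIT {limit}\n"
        f"OFFSET {offset}"
    )


query = build_search_query('Phys"ics', limit=500, offset=-3)
print(query)
```

Escaping the quote character keeps a crafted search term from closing the literal and appending its own graph patterns.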

6.2 Category membership (dct:subject)

PREFIX dct: <http://purl.org/dc/terms/>

SELECT ?entity
WHERE {
  # Example category IRI
  ?entity dct:subject <http://dbpedia.org/resource/Category:Physics> .
}
LIMIT 25
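Because the category IRI is interpolated into the query text, it should be validated first. The sketch below rejects IRIs outside the DBpedia category namespace and any character that is not permitted inside a SPARQL IRI reference; the helper names and the namespace check are assumptions about how such validation might be done, not the backend's actual code.

```python
import re

CATEGORY_PREFIX = "http://dbpedia.org/resource/Category:"

# Characters the SPARQL grammar forbids inside <...>: control chars, space, and <>"{}|^`\
_FORBIDDEN = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def category_iri(iri: str) -> str:
    """Return the IRI wrapped in angle brackets, or raise if it is unsafe."""
    if not iri.startswith(CATEGORY_PREFIX) or _FORBIDDEN.search(iri):
        raise ValueError(f"not a valid DBpedia category IRI: {iri!r}")
    return f"<{iri}>"


def build_membership_query(iri: str, limit: int = 25, offset: int = 0) -> str:
    subject = category_iri(iri)
    limit = max(1, int(limit))
    offset = max(0, int(offset))
    return (
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?entity\n"
        "WHERE {\n"
        f"  ?entity dct:subject {subject} .\n"
        "}\n"
        f"LIMIT {limit}\n"
        f"OFFSET {offset}"
    )


membership_query = build_membership_query("http://dbpedia.org/resource/Category:Physics")
print(membership_query)
```

The same validated IRI can be reused for the type facets query in Section 6.3, which shares the dct:subject pattern.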

6.3 Type facets (rdf:type counts)

PREFIX dct: <http://purl.org/dc/terms/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?type (COUNT(DISTINCT ?e) AS ?count)
WHERE {
  # Example category IRI
  ?e dct:subject <http://dbpedia.org/resource/Category:Physics> .
  ?e rdf:type ?type .
}
GROUP BY ?type
ORDER BY DESC(?count)
LIMIT 15

Pragmatic note: Some DBpedia categories yield sparse or noisy results due to data modeling; the UI and extensions treat these as best-effort.

7. Extensions (3+)

7.1 Extension 1 — Type Facets (Intelligent filtering)

Displays the most common rdf:type values among entities in the selected category. Users can select types and apply a filter. This demonstrates intelligent filtering driven by SPARQL aggregation.
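The facet logic can be illustrated on in-memory data. In the application the counts come from the SPARQL aggregation in Section 6.3; the sketch below only mirrors that behavior locally. The entity shape ({"uri", "types"}), the function names, and the choice of "match any selected type" semantics are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def facet_counts(entities: List[Dict], top: int = 15) -> List[Tuple[str, int]]:
    """Count how many distinct entities carry each type; return the most common ones."""
    counts: Counter = Counter()
    for entity in entities:
        counts.update(set(entity.get("types", [])))  # set(): count each entity once per type
    return counts.most_common(top)


def apply_type_filter(entities: List[Dict], selected: Iterable[str]) -> List[Dict]:
    """Keep entities that have at least one of the selected types (OR semantics, assumed)."""
    wanted = set(selected)
    if not wanted:
        return list(entities)  # no selection means no filtering
    return [e for e in entities if wanted & set(e.get("types", []))]


entities = [
    {"uri": "A", "types": ["dbo:Person", "dbo:Scientist"]},
    {"uri": "B", "types": ["dbo:Person"]},
    {"uri": "C", "types": ["dbo:Place"]},
    {"uri": "D"},  # entity without type information
]
print(facet_counts(entities))
print([e["uri"] for e in apply_type_filter(entities, ["dbo:Place"])])
```

Entities without any rdf:type are simply excluded once a filter is active, which is consistent with the best-effort treatment of sparse data.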

7.2 Extension 2 — Category Map (Layered visualization)

Visualizes a local neighborhood of the current category (broader and narrower links), enabling quick contextual navigation. This demonstrates a layered visualization for hierarchical knowledge structures.
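One possible data structure behind such a map is a small layered graph: broader categories on layer -1, the current category on layer 0, and narrower categories on layer +1. The function name, the node/edge shape, and the layer numbering below are hypothetical; the report does not describe the format consumed by the visualization.

```python
from typing import Dict, List


def build_category_map(current: str, broader: List[str], narrower: List[str]) -> Dict[str, list]:
    """Build a local layered graph: broader = -1, current = 0, narrower = +1."""
    layers = {current: 0}
    edges = []
    for parent in broader:
        if parent not in layers:  # skip duplicates and self-links
            layers[parent] = -1
            edges.append({"source": parent, "target": current})
    for child in narrower:
        if child not in layers:   # a category already placed as broader is not repeated
            layers[child] = 1
            edges.append({"source": current, "target": child})
    nodes = [{"id": uri, "layer": layer} for uri, layer in layers.items()]
    return {"nodes": nodes, "edges": edges}


graph = build_category_map(
    "Category:Physics",
    broader=["Category:Natural_sciences"],
    narrower=["Category:Mechanics", "Category:Optics", "Category:Mechanics"],
)
print(graph)
```

Edges always point from the broader category to the narrower one, so a renderer can lay the layers out top to bottom without inspecting the relation type.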

7.3 Extension 3 — Related Entities (Comparison/similarity)

Suggests entities related to the current entity by computing overlap of categories. This provides a lightweight similarity function for exploration.
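A minimal sketch of category-overlap ranking follows. It orders candidates by the number of shared categories and uses the Jaccard index as a tie-breaker; the report only states that overlap is computed, so the Jaccard tie-breaker, the function name, and the data shapes are assumptions.

```python
from typing import Dict, List, Set, Tuple


def rank_related(
    current: Set[str], candidates: Dict[str, Set[str]], top: int = 10
) -> List[Tuple[str, int, float]]:
    """Rank candidate entities by shared categories, then Jaccard similarity, then URI."""
    scored = []
    for uri, categories in candidates.items():
        shared = len(current & categories)
        if shared == 0:
            continue  # unrelated entities are not suggested
        jaccard = shared / len(current | categories)
        scored.append((uri, shared, round(jaccard, 3)))
    scored.sort(key=lambda item: (-item[1], -item[2], item[0]))
    return scored[:top]


current_categories = {"Physics", "Natural_sciences", "Concepts"}
candidate_categories = {
    "Chemistry": {"Natural_sciences", "Concepts"},
    "Astronomy": {"Natural_sciences", "Space"},
    "Poetry": {"Literature"},
}
related = rank_related(current_categories, candidate_categories)
print(related)
```

Normalizing by the union keeps entities with very long category lists from dominating purely because they overlap with almost everything.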

8. Use of DBpedia SPARQL, SKOS, and OWL

This project relies on the public DBpedia SPARQL endpoint as its primary knowledge source and reuses existing Semantic Web vocabularies rather than defining a new ontology. The goal is to demonstrate pragmatic reuse of established knowledge models in a real Web application.

8.1 DBpedia SPARQL endpoint

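All knowledge presented in the application is retrieved dynamically from DBpedia via SPARQL queries. The backend acts as a controlled intermediary that generates the SPARQL queries, applies limits and pagination, and shapes the returned bindings into compact JSON for the SPA (see Section 4.3).

Typical SPARQL patterns used include label search via rdfs:label, category membership via dct:subject, and rdf:type aggregation (see Section 6).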

Official documentation:

8.2 Use of SKOS for category navigation

DBpedia categories are treated as concept-like resources and exposed in the UI following SKOS (Simple Knowledge Organization System) principles.

Although DBpedia categories are not explicitly typed as skos:Concept in all cases, they naturally form a hierarchical knowledge organization that aligns well with SKOS semantics.

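In the application, broader and narrower categories drive hierarchical navigation on category pages, and these relationships are exposed in the page markup using RDFa (see Section 3.2).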

This enables both human-friendly exploration and machine-readable interpretation of category hierarchies.
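A machine-readable view of a category could be serialized as JSON-LD along the following lines. This is a sketch under stated assumptions: the report says the application injects schema.org JSON-LD but does not give its shape, so the combination of skos:Concept with schema:name, the function names, and the example narrower category are illustrative only.

```python
import json
from typing import Dict, List


def category_jsonld(uri: str, label: str, broader: List[str], narrower: List[str]) -> Dict:
    """Describe a category as a SKOS-like concept (assumed shape, not the app's actual output)."""
    return {
        "@context": {
            "skos": "http://www.w3.org/2004/02/skos/core#",
            "schema": "https://schema.org/",
        },
        "@id": uri,
        "@type": "skos:Concept",
        "schema:name": label,
        "skos:prefLabel": {"@value": label, "@language": "en"},
        "skos:broader": [{"@id": b} for b in broader],
        "skos:narrower": [{"@id": n} for n in narrower],
    }


def to_script_tag(data: Dict) -> str:
    """Serialize for embedding in HTML; '<' is escaped so the payload cannot close the tag."""
    payload = json.dumps(data, ensure_ascii=False).replace("<", "\\u003c")
    return f'<script type="application/ld+json">{payload}</script>'


doc = category_jsonld(
    "http://dbpedia.org/resource/Category:Physics",
    "Physics",
    broader=["http://dbpedia.org/resource/Category:Natural_sciences"],
    narrower=["http://dbpedia.org/resource/Category:Mechanics"],  # illustrative value
)
print(to_script_tag(doc))
```

Escaping "<" matters because labels come from an external dataset and are embedded directly into the page.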

Official SKOS documentation:

8.3 Use of OWL and RDF types

OWL and RDF are used primarily to expose and aggregate semantic typing information for entities. DBpedia entities are associated with multiple rdf:type assertions, typically drawn from the DBpedia Ontology (dbo: namespace).

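In the application, rdf:type information is used to compute the Type Facets for a category (Section 7.1) and to display each entity's type assertions on its entity page.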

While full OWL reasoning is outside the scope of the project, the reuse of OWL-based vocabularies demonstrates how lightweight semantic typing can enhance exploratory interfaces.

Official documentation:

9. Linked Data Principles

10. Deployment

The system is deployed as Docker containers (SPA + API) orchestrated with Docker Compose on Google Compute Engine. Images are published to GitHub Container Registry (GHCR) and pulled by the VM.

10.1 Deployment steps (high level)

  1. Build and push images to GHCR (private).
  2. Provision a GCE VM and install Docker + Compose.
  3. Authenticate the VM to GHCR using a token.
  4. Run docker compose pull and docker compose up -d.

Verify deployment via /health and by interacting with the SPA.

11. User Guide & Case Studies

This section provides a user-oriented guide to the DBpedia Explorer (WADe) application. The goal is to demonstrate how end users can explore, understand, and compare large knowledge collections without prior expertise in Semantic Web technologies. The guide is structured around three concrete case studies, each corresponding to a common exploration task.

Case Study 1 — Exploratory Search and Result Navigation

User goal. Obtain an overview of entities and categories related to a general topic and navigate large result sets efficiently.

Context. Large knowledge bases such as DBpedia contain thousands of resources for common topics. Presenting all results at once would overwhelm users. The search interface therefore focuses on progressive disclosure, pagination, and clear visual differentiation between result types.

Steps.

  1. The user opens the landing page of the application.
  2. The user enters a query (e.g., physics) and explicitly initiates the search.
  3. The system displays a limited number of results per page, visually distinguishing entities from categories.
  4. The user navigates between pages to explore additional results.

Screenshot: Search results for the query "physics", showing both categories and entities, along with pagination controls.

Outcome. The user gains an immediate high-level understanding of the knowledge space related to the query and can decide which results are worth deeper exploration. Pagination and result counts support orientation and prevent cognitive overload.

Case Study 2 — Category Exploration with Hierarchy and Faceted Filtering

User goal. Explore the internal structure of a knowledge domain and narrow down large result sets using meaningful semantic criteria.

Context. Categories in DBpedia act as organizational hubs. However, they may contain thousands of heterogeneous entities. This case study demonstrates how hierarchical navigation and faceted filtering support sense-making.

Steps.

  1. The user opens a category page from the search results (e.g., a physics-related category).
  2. The system presents broader and narrower categories, enabling hierarchical navigation.
  3. The user inspects the Type Facets extension to see the most common rdf:type values among entities.
  4. The user selects one or more types and applies the filter to update the entity list.
  5. The user optionally uses the Category Map extension to visualize the local category neighborhood.

Screenshot: Category page showing hierarchical navigation (broader/narrower) and the Type Facets extension used for intelligent filtering.

Outcome. The user can progressively refine the explored knowledge domain, moving from a broad category to a focused subset of entities. Faceted filtering and layered visualization reduce complexity while preserving semantic meaning.

Case Study 3 — Entity Inspection and Discovery of Related Knowledge

User goal. Understand an individual entity in context and discover related entities based on shared semantic characteristics.

Context. Individual entities in DBpedia often have rich semantic links to categories and other entities. Presenting this information in a structured and visual manner supports comparison and deeper understanding.

Steps.

  1. The user opens an entity page from a category or search result.
  2. The system displays the entity’s label, abstract description, categories, and rdf:type assertions.
  3. The user inspects the Related Entities extension to discover semantically similar entities.
  4. The user shares the entity page using the generated URL or QR code.

Screenshot: Entity page presenting descriptive information, semantic categories, and rdf:type assertions.

Outcome. The user develops a contextual understanding of the entity and can easily transition to related knowledge. Semantic relationships are exposed in a human-readable form, while sharing features support reuse beyond the application.

12. Conclusion

DBpedia Explorer (WADe) demonstrates how a modular SPA can expose Semantic Web datasets through human-friendly exploration and reusable extensions. The system is service-oriented, uses SPARQL and RDF-based knowledge models, provides machine-readable annotations, and is deployed using modern Web engineering practices.