platform

This repository contains a Docker setup with:

  • The data processing pipeline, with an API ready for data reception and connected to Virtuoso for publishing the prepared data.
  • An instance of Virtuoso for persistent linked data storage.
  • A SPARQL endpoint ready to receive queries.
  • A simple web page explaining the status of the project and its steps, showing some examples, and offering a direct link for running SPARQL queries.

Basic Overview

Easy deployment of nextProcurement tools and services into a local environment.

Quick Start

  1. Install Docker and Docker Compose

  2. Clone this repo

    git clone https://github.com/nextprocurement/platform.git
    
  3. Enter the NextProc folder and start the platform:

    cd platform/NextProc
    sudo docker-compose up --build -d
    

If Virtuoso has problems starting, add this folder to Docker's file sharing settings and retry.
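
Before moving on, you can confirm that the containers came up (standard Docker Compose commands; nothing project-specific is assumed):

    sudo docker-compose ps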

  4. Wait for all services to be available. The first time, it may take a few minutes to download the Docker images.

    sudo docker-compose logs -f
    
  5. Preprocess the Parquet file (it must be placed in the data folder):

    sudo python3 ./pipeline/py/applyRules.py -i data -o out -r 'pipeline/py/rml-mappings/mapping.rml.ttl'
    

    If a module is missing at any point, install it with "sudo pip install moduleName" (a requirement.txt file is also available).
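
    To install everything up front instead, the dependencies can be installed in one go (assuming the requirement.txt file sits at the repository root; adjust the path otherwise):

    sudo pip install -r requirement.txt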

  6. Initialize the RDF repository:

    1. Go to the Virtuoso administration GUI at http://localhost:8890
    2. Create a new graph tbfy following the instructions below:
      1. Go to the Virtuoso Conductor and log in (dba/dba).
      2. Go to the tab System Admin > User Accounts.
      3. Grant the permission SPARQL_UPDATE to the user SPARQL.
      4. Go to http://localhost:8890/sparql/ and create the tbfy graph by running the following query:
      CREATE GRAPH <http://127.0.0.1:8890/tbfy>
      
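      Alternatively, the same graph can be created from the command line against the SPARQL endpoint (a sketch assuming the default port 8890 and that SPARQL_UPDATE was granted as in the steps above):

      curl http://localhost:8890/sparql/ --data-urlencode 'query=CREATE GRAPH <http://127.0.0.1:8890/tbfy>'
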
  7. Upload the triples to the graph (depending on the size, it may take a while; the timeout is set to 300 s):

    sudo python3 ./pipeline/py/publish_rdf.py -i out
    
  8. That's all! The graph will be available at http://127.0.0.1:8890/tbfy, and you can query it from http://localhost:8890/sparql/
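
As a quick smoke test, the following generic query (nothing assumed beyond the graph IRI used above) returns a handful of the loaded triples; run it from http://localhost:8890/sparql/:

    SELECT ?s ?p ?o
    FROM <http://127.0.0.1:8890/tbfy>
    WHERE { ?s ?p ?o }
    LIMIT 10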

Acknowledgements

This work has received funding from the NextProcurement European Action (grant agreement INEA/CEF/ICT/A2020/2373713-Action 2020-ES-IA-0255).

(EU logo and Next Procurement logo)
