Benchmarking of distributed linked data streaming systems
This project has received funding from the European Union's H2020 research and innovation action program under grant agreement number 688227.
The project runtime is December 2015 until November 2018.
The HOBBIT project
Pavel Smirnov
AGT International
Stream Reasoning Workshop
January 17, 2018
Overview
• The HOBBIT project
• DEBS challenges
• Available benchmarks overview
• Summary
Goal
To abolish the barriers in the adoption and deployment of Big Linked Data by European companies by:
• The deployment of benchmarks on data that reflects reality within realistic settings.
• The provision of corresponding industry-relevant key performance indicators (KPIs).
• The computation of comparable results on standardized hardware.
• The institution of an independent and thus bias-free organization to conduct regular benchmarks and
provide the European industry with up-to-date performance results.
Deliverables:
• The benchmarking platform (the HOBBIT platform)
• The set of benchmarks with KPIs
• Benchmarking association
The HOBBIT project. Overview
http://project-hobbit.eu
The HOBBIT platform. Business logic
Customer
Requires ranking of alternative
solutions by some KPI
Solution provider (vendor)
(e.g. DB, Streaming Platforms, ML
frameworks, etc…)
The HOBBIT platform
(online or local instance)
Provides:
1. Automatic benchmark executions
2. Leaderboards (online or private)
Main advantages:
1. Streaming fashion
2. Docker virtualization
3. RDF-enabled
Submit
benchmarks
Submit
systems
http://github.com/hobbit-project/platform
The HOBBIT platform. Architecture
The data pipeline:
1. Send raw/initial data (optional)
2. Send raw tuples
3.1 Send tasks (task={tuple, id})
3.2 Send expected results per task
4. Send actual results per task
5. Send the “expected-actual” pairs
6. Send KPIs back to the controller
7. Send KPIs back to the platform
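The flow of steps 2 to 6 can be sketched in a few lines of Python. This is only an in-memory illustration, not the HOBBIT API: the names `Task`, `run_pipeline`, `expected_fn` and `system_fn` are hypothetical, and the "accuracy" KPI is an assumed example metric. On the real platform these steps run in separate Docker containers that exchange messages.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int   # the "id" part of task={tuple, id}
    payload: str   # the "tuple" part of task={tuple, id}


def run_pipeline(raw_tuples, expected_fn, system_fn):
    """Walk steps 2-6 of the data pipeline over an in-memory list of tuples."""
    # Steps 2 and 3.1: wrap each raw tuple into a task with an id.
    tasks = [Task(i, t) for i, t in enumerate(raw_tuples)]
    # Step 3.2: store the expected result per task, keyed by task id.
    expected = {task.task_id: expected_fn(task.payload) for task in tasks}
    # Step 4: the system under test (a black box) answers each task.
    actual = {task.task_id: system_fn(task.payload) for task in tasks}
    # Step 5: join both sides into "expected-actual" pairs.
    pairs = [(expected[i], actual[i]) for i in sorted(expected)]
    # Step 6: derive a KPI from the pairs (assumed here: share of correct answers).
    correct = sum(1 for exp, act in pairs if exp == act)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"tasks": len(pairs), "accuracy": accuracy}


kpis = run_pipeline(
    raw_tuples=["a", "b", "c", "d"],
    expected_fn=str.upper,
    # Hypothetical system that gets exactly one answer wrong.
    system_fn=lambda t: t.upper() if t != "c" else "?",
)
print(kpis)
```

Keying both result sets by task id is what lets the evaluation join expected and actual answers even when the system replies out of order.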
Benchmark (customer’s application)
System components
(black box for customers)
Platform components
The online platform:
http://master.project-hobbit.eu/
Cluster: 6 nodes, each is
2×64 bit Intel Xeon E5-2630v3
(8-Cores, 2.4 GHz, HT, 20MB
Cache, each proc.), 256 GB RAM,
1Gb Ethernet
Nodes (benchmark/system): 3/3
https://github.com/hobbit-project/platform/wiki/Overview
The HOBBIT platform. Technologies
Platform communication channel (RabbitMQ only)
Data transportation channel (app-specific)
Platform-side:
1. Java
2. RabbitMQ
3. Docker+Swarm
4. GitLab
5. Redis
6. Virtuoso (RDF)
7. NodeJS
8. KeyCloak
App-side (defaults):
1. Java
2. RabbitMQ
Application side Platform side
(RabbitMQ, Kafka, Netty, Akka…)
Design and upload to HOBBIT
Create a project at
https://git.project-hobbit.eu
Create an account at
https://master.project-hobbit.eu
Clone and extend the basic codes:
https://github.com/hobbit-project/java-sdk-example
Design components using the manuals:
Run tests locally as pure java code
Update ttl-files for your project
Upload
Design (alternative using the Java SDK)
Develop a benchmark component in Java
Develop a component in Java
Develop a system adapter
Develop a system adapter in Java
Create docker files using details (manual)
Design (the standard HOBBIT way)
Debug Docker images by running tests
Find your benchmark or system at
https://master.project-hobbit.eu
Build images (manual)
Configure remote project details
Upload docker images to
https://git.project-hobbit.eu
- Lots of understanding and manual work
- Impossible to debug locally *
- Upload non-tested images *
- No logs from the online platform, only GUI *
+ Clone and extend standard classes with your logic
+ Test and debug your code from IDE
+ Build Docker images on demand from IDE
+ Run your images from IDE, check all internal logs
+ Upload fully tested images
* Unless you have a local HOBBIT deployment
Example: single benchmark run
http://master.project-hobbit.eu/
Example: challenges & leaderboards
http://master.project-hobbit.eu/
Challenges: DEBS GC 2017
DEBS Grand Challenge 2017 successfully completed
Anomaly detection for injection molding machines over RDF-streams.
• 14 teams registered
• 7 teams passed the correctness check
• 2 were awarded (main and audience award)
StreaML Open Challenge is open; Prize: 500 €
The main result:
For the first time we can objectively quantify the performance of
a distributed stream processing pipeline running analytics algorithms
https://project-hobbit.eu/challenges/debs-grand-challenge/
https://project-hobbit.eu/open-challenges/streaml-open-challenge/
The anomaly detector:
Start: after at least W time units
1. Find cluster centers over W time units
2. Train Markov model over last W time units
3. Apply Markov model for anomaly detection
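The three anomaly detector steps (cluster, train, apply) can be illustrated with a small Python sketch. It is not the challenge's reference implementation: the 1-D k-means seeding, the helper names, and the parameters `k`, `n_transitions` and `threshold` are assumptions made for this example.

```python
def kmeans_1d(values, k, iterations=10):
    """Plain 1-D k-means; centers are seeded with the first k distinct values."""
    centers = []
    for v in values:
        if v not in centers:
            centers.append(v)
        if len(centers) == k:
            break
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(v, centers)].append(v)
        # Keep the old center when a cluster received no values.
        centers = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers


def nearest(value, centers):
    """Index of the cluster center closest to value."""
    return min(range(len(centers)), key=lambda c: abs(value - centers[c]))


def train_markov(states, n_states):
    """Estimate transition probabilities from consecutive state pairs."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in counts]


def is_anomaly(window, k, n_transitions, threshold):
    """Flag the window if its last n_transitions are improbable under the model."""
    centers = kmeans_1d(window, k)                       # step 1: cluster centers
    states = [nearest(v, centers) for v in window]
    probs = train_markov(states, len(centers))           # step 2: train over window
    recent = states[-(n_transitions + 1):]
    p = 1.0
    for a, b in zip(recent, recent[1:]):                 # step 3: apply the model
        p *= probs[a][b]
    return p < threshold


spike = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 1.0, 9.0]
regular = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0, 1.0, 5.0]
print(is_anomaly(spike, k=2, n_transitions=1, threshold=0.2))
print(is_anomaly(regular, k=2, n_transitions=1, threshold=0.2))
```

In the first window the jump to 9.0 is a transition seen only once in seven, so its probability falls below the assumed threshold; the alternating second window contains only transitions the model has seen every time.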
Challenges: DEBS GC 2018
DEBS Grand Challenge 2018 has just started
https://project-hobbit.eu/challenges/debs2018-grand-challenge/
Prediction of arrival times and ports on marine traffic data.
Prize: 1000 € + publication in the DEBS proceedings (the conference will be in New Zealand)
DEBS Grand Challenge 2017
• Synthetically generated data
• Predefined algorithms
• True RDF-streaming benchmark
• Focus: correctness check, throughput, latency
DEBS Grand Challenge 2018
• Real annotated data
• No predefined approach
• True ML-benchmark
• Focus: prediction accuracy, performance
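Throughput and latency, the performance KPIs named for the 2017 challenge, can be computed from per-task timestamps. The sketch below is a minimal illustration under assumed definitions (throughput as answered tasks per second of wall-clock time, latency as the mean answer delay); the function name and the dictionary layout are hypothetical and not taken from the challenge's evaluation code.

```python
def throughput_and_latency(sent_at, received_at):
    """Both arguments map task id -> timestamp in seconds."""
    # Only tasks that were actually answered contribute to the KPIs.
    answered = [i for i in sent_at if i in received_at]
    if not answered:
        return {"throughput": 0.0, "avg_latency": 0.0}
    latencies = [received_at[i] - sent_at[i] for i in answered]
    # Wall-clock span from the first task sent to the last answer received.
    duration = max(received_at[i] for i in answered) - min(sent_at[i] for i in answered)
    return {
        "throughput": len(answered) / duration if duration > 0 else float("inf"),
        "avg_latency": sum(latencies) / len(latencies),
    }


sent = {1: 0.0, 2: 1.0, 3: 2.0}
received = {1: 0.5, 2: 1.5, 3: 4.0}
result = throughput_and_latency(sent, received)
print(result)
```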
Available benchmarks overview
Versioning Benchmark
• Benchmark for assessing the ability of
versioning systems to efficiently
manage evolving datasets and queries
Data Storage Benchmark
• Benchmark for RDF data storage
solutions against an interactive
workload in a real-world scenario, using
various dataset sizes
Linking Benchmark
• Benchmark for assessing the
performance of instance matching
tools that implement string-based
approaches
Faceted Browsing Benchmark
• Benchmark for systems which support
browsing through linked data by
iterative transitions performed by an
intelligent user
ODIN Benchmark
• benchmark for data extraction
solutions for structured data
• simulates the ingestion, storage
and retrieval of streams of RDF
data
Spatial Benchmark
• Benchmark for systems which deal with
topological relations proposed in the
state of the art DE-9IM model.
Question Answering Benchmark
• Benchmark for ranking question
answering systems based on their
performance and accuracy
GERBIL Benchmark
• benchmark for entity annotation
and disambiguation tools
• 9 annotators, 11 RDF datasets
Stream Machine Learning Benchmark
• Benchmark for assessing the performance of
anomaly detection for injection molding
machines over RDF-streams
Stream Machine Learning Benchmark v2
• Benchmark for assessing the accuracy of
prediction over a stream of marine traffic
data
http://github.com/hobbit-project
Summary
The HOBBIT platform
• Ability to benchmark heterogeneous distributed systems in a streaming fashion
• A set of benchmarks to compare relevant Linked Data technologies and solutions
• We apply the HOBBIT platform to rank machine-learning pipelines over RDF streams
• The platform may serve as a basis for benchmarking stream-reasoning solutions
Q&A
Thank you for your attention!
psmirnov@agtinternational.com
http://twitter.com/smirnp
http://twitter.com/AGTIntl

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Benchmarking of distributed linked data streaming systems

  • 1. Benchmarking of distributed linked data streaming systems
      The HOBBIT project
      Pavel Smirnov, AGT International
      Stream Reasoning Workshop, January 17, 2018
      This project has received funding from the European Union's H2020 research and innovation action program under grant agreement number 688227. The project runtime is December 2015 until November 2018.
  • 2. Overview
      • The HOBBIT project
      • DEBS challenges
      • Available benchmarks overview
      • Summary
  • 3. The HOBBIT project. Overview
      Goal: to abolish the barriers in the adoption and deployment of Big Linked Data by European companies by:
      • The deployment of benchmarks on data that reflects reality within realistic settings.
      • The provision of corresponding industry-relevant key performance indicators (KPIs).
      • The computation of comparable results on standardized hardware.
      • The institution of an independent and thus bias-free organization to conduct regular benchmarks and provide the European industry with up-to-date performance results.
      Deliverables:
      • The benchmarking platform (the HOBBIT platform)
      • The set of benchmarks with KPIs
      • Benchmarking association
      http://project-hobbit.eu
  • 4. The HOBBIT platform. Business logic
      Actors:
      • Customer: requires ranking of alternative solutions by some KPI; submits benchmarks
      • Solution provider (vendor), e.g. DB, streaming platforms, ML frameworks, etc…: submits systems
      • The HOBBIT platform (online or local instance)
      The platform provides:
      1. Automatic benchmark executions
      2. Leaderboards (online or private)
      Main advantages:
      1. Streaming fashion
      2. Docker virtualization
      3. RDF-enabled
      http://github.com/hobbit-project/platform
  • 5. The HOBBIT platform. Architecture
      Component groups:
      • Benchmark (customer's application)
      • System components (black box for customers)
      • Platform components
      The data pipeline:
      1. Sending raw/initial data (optional)
      2. Sending raw tuples
      3.1 Sending tasks (task={tuple, id})
      3.2 Sending expected results per task
      4. Sending actual results per task
      5. Sending the "expected-actual" pairs
      6. Sending KPIs back to the controller
      7. Sending KPIs back to the platform
      The online platform: http://master.project-hobbit.eu/
      • Cluster: 6 nodes, each with 2× 64-bit Intel Xeon E5-2630v3 (8 cores, 2.4 GHz, HT, 20 MB cache per processor), 256 GB RAM, 1 Gb Ethernet
      • Nodes (benchmark/system): 3/3
      https://github.com/hobbit-project/platform/wiki/Overview
      http://github.com/hobbit-project/platform
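The data pipeline on slide 5 can be illustrated with a small, self-contained sketch. It is not the HOBBIT API: the names `Task`, `make_tasks`, `pair_results`, `compute_kpis` and `toy_system` are hypothetical, everything runs in one process instead of over message queues, and the KPIs (answered count and accuracy) are assumptions chosen only to show how "expected-actual" pairs turn into numbers.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str   # id attached to the tuple (task = {tuple, id})
    payload: str   # the raw tuple itself


def make_tasks(raw_tuples):
    """Step 3.1: wrap every raw tuple into a task with an id."""
    return [Task(task_id=str(i), payload=t) for i, t in enumerate(raw_tuples)]


def pair_results(expected, actual):
    """Step 5: join expected and actual results by task id.

    A task the system never answered gets None as its actual result.
    """
    return [(tid, exp, actual.get(tid)) for tid, exp in expected.items()]


def compute_kpis(pairs):
    """Steps 6-7: derive KPIs from the pairs (assumed KPIs, for illustration)."""
    total = len(pairs)
    answered = sum(1 for _, _, act in pairs if act is not None)
    correct = sum(1 for _, exp, act in pairs if act == exp)
    accuracy = correct / total if total else 0.0
    return {"tasks": total, "answered": answered, "accuracy": accuracy}


def toy_system(task):
    """Hypothetical system under test: upper-cases the payload."""
    return task.payload.upper()


raw_tuples = ["a", "b", "c", "d"]                         # step 2
tasks = make_tasks(raw_tuples)                            # step 3.1
expected = {t.task_id: t.payload.upper() for t in tasks}  # step 3.2
actual = {t.task_id: toy_system(t) for t in tasks[:3]}    # step 4 (last task unanswered)
actual["1"] = "wrong"                                     # simulate one incorrect answer
kpis = compute_kpis(pair_results(expected, actual))
print(kpis)
```

Keying both result sets by task id is what lets the evaluation stay independent of arrival order, which matters when results come back over a stream.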
  • 6. The HOBBIT platform. Technologies
      Channels between the application side and the platform side:
      • Platform communication channel (RabbitMQ only)
      • Data transportation channel (app-specific: RabbitMQ, Kafka, Netty, Akka…)
      Platform-side:
      1. Java
      2. RabbitMQ
      3. Docker+Swarm
      4. GitLab
      5. Redis
      6. Virtuoso (RDF)
      7. NodeJS
      8. KeyCloak
      App-side (defaults):
      1. Java
      2. RabbitMQ
      https://github.com/hobbit-project/platform/wiki/Overview
      http://github.com/hobbit-project/platform
  • 7. Design and upload to HOBBIT
      Common steps:
      • Create an account at https://master.project-hobbit.eu
      • Create a project at https://git.project-hobbit.eu
      • Upload Docker images to https://git.project-hobbit.eu
      • Find your benchmark or system at https://master.project-hobbit.eu
      Design (the standard HOBBIT way):
      • Design components using the manuals: develop a benchmark component or a system adapter in Java
      • Create Docker files using details (manual)
      • Build images (manual)
      - Lots of understanding and manual work
      - Impossible to debug locally *
      - Upload of non-tested images *
      - No logs from the online platform, only GUI *
      Design (alternative using the Java SDK):
      • Clone and extend the basic code: https://github.com/hobbit-project/java-sdk-example
      • Run tests locally as pure Java code
      • Debug Docker images by running tests
      • Update the ttl-files for your project
      • Configure remote project details
      + Clone and extend standard classes with your logic
      + Test and debug your code from the IDE
      + Build Docker images on demand from the IDE
      + Run your images from the IDE, check all internal logs
      + Upload fully tested images
      * Unless you have a local HOBBIT deployment
  • 8. Example: single benchmark run
      http://master.project-hobbit.eu/
  • 9. Example: challenges & leaderboards
      http://master.project-hobbit.eu/
  • 10. Challenges: DEBS GC 2017
      DEBS Grand Challenge 2017 successfully completed: anomaly detection for injection molding machines over RDF streams.
      • 14 teams registered
      • 7 teams passed the correctness check
      • 2 were awarded (main and audience award)
      The StreaML Open Challenge is open; prize: 500 €
      The main result: for the first time we can objectively quantify the performance of a distributed stream processing pipeline running analytics algorithms.
      The anomaly detector:
      1. Start; wait until at least W time units have passed
      2. Find cluster centers over W time units
      3. Train a Markov model over the last W time units
      4. Apply the Markov model for anomaly detection
      https://project-hobbit.eu/challenges/debs-grand-challenge/
      https://project-hobbit.eu/open-challenges/streaml-open-challenge/
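The three stages of the anomaly detector on slide 10 (cluster centers, Markov model training, Markov model application) can be sketched as follows. This is a minimal illustration under stated assumptions, not the challenge's reference implementation: the readings are one-dimensional, clustering is a plain k-means seeded with the first k distinct values, the Markov model is first-order, and a window is flagged when the probability of its last `n_transitions` transitions falls below `threshold`. All function and parameter names are hypothetical.

```python
def nearest(centers, value):
    """Index of the cluster center closest to value (first one wins on ties)."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - value))


def kmeans_1d(values, k, iterations=10):
    """Stage 1: find up to k cluster centers over the window (assumed 1-D k-means)."""
    centers = []
    for v in values:                      # seed with the first k distinct values
        if v not in centers:
            centers.append(v)
        if len(centers) == k:
            break
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(centers, v)].append(v)
        # Keep the old center when a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def train_markov(states, k):
    """Stage 2: estimate first-order transition probabilities from the state sequence."""
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs


def sequence_probability(states, probs):
    """Probability of observing the given chain of transitions."""
    p = 1.0
    for a, b in zip(states, states[1:]):
        p *= probs[a][b]
    return p


def detect(window, k, n_transitions, threshold):
    """Stage 3: flag the window if its most recent transitions are improbable."""
    centers = kmeans_1d(window, k)
    states = [nearest(centers, v) for v in window]
    probs = train_markov(states, len(centers))
    tail = states[-(n_transitions + 1):]
    return sequence_probability(tail, probs) < threshold


spike_at_end = [1, 1, 1, 1, 1, 1, 1, 1, 9, 1]
spike_earlier = [1, 1, 1, 1, 9, 1, 1, 1, 1, 1]
print(detect(spike_at_end, k=2, n_transitions=2, threshold=0.2))   # True
print(detect(spike_earlier, k=2, n_transitions=2, threshold=0.2))  # False
```

In both windows the jump to 9 is equally rare; only the first one is flagged because the rare transition falls inside the last two transitions that the detector inspects.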
  • 11. Challenges: DEBS GC 2018
      DEBS Grand Challenge 2018 has just started: prediction of arrival times and ports on marine traffic data.
      Prize: 1000 € + publication in the DEBS proceedings (the conference will be in New Zealand)
      https://project-hobbit.eu/challenges/debs2018-grand-challenge/
      DEBS Grand Challenge 2017:
      • Synthetically generated data
      • Predefined algorithms
      • True RDF-streaming benchmark
      • Focus: correctness check, throughput, latency
      DEBS Grand Challenge 2018:
      • Real annotated data
      • No predefined approach
      • True ML benchmark
      • Focus: prediction accuracy, performance
  • 12. Available benchmarks overview
      • Versioning Benchmark: assesses the ability of versioning systems to efficiently manage evolving datasets and queries
      • Data Storage Benchmark: tests RDF data storage solutions against an interactive workload in a real-world scenario, using various dataset sizes
      • Linking Benchmark: assesses the performance of instance matching tools that implement string-based approaches
      • Faceted Browsing Benchmark: targets systems that support browsing through linked data by iterative transitions performed by an intelligent user
      • ODIN Benchmark: targets data extraction solutions for structured data; simulates the ingestion, storage and retrieval of streams of RDF data
      • Spatial Benchmark: targets systems that deal with the topological relations proposed in the state-of-the-art DE-9IM model
      • Question Answering Benchmark: ranks question answering systems based on their performance and accuracy
      • GERBIL Benchmark: targets entity annotation and disambiguation tools; 9 annotators, 11 RDF datasets
      • Stream Machine Learning Benchmark: assesses the performance of anomaly detection for injection molding machines over RDF streams
      • Stream Machine Learning Benchmark v2: assesses the accuracy of prediction over a stream of marine traffic data
      http://github.com/hobbit-project
  • 13. Summary
      The HOBBIT platform:
      • Ability to benchmark heterogeneous distributed systems in streaming fashion
      • A set of benchmarks to compare relevant Linked Data technologies and solutions
      • We apply the HOBBIT platform to rank machine-learning pipelines over RDF streams
      • The platform may serve as a basis for benchmarking stream-reasoning solutions
  • 14. Q&A
      Thank you for your attention!
      psmirnov@agtinternational.com
      http://twitter.com/smirnp
      http://twitter.com/AGTIntl