Real-time information is still offered in very heterogeneous formats, not necessarily similar to RDF or RDF streams. In some real-time environments, several data serialization formats are used to exchange real-time information. In an integrated view, these sources could be considered together and queried in the same way, yielding a single, useful service able to describe the entire environment.
Since no SPARQL extension is able to query such heterogeneous formats simultaneously without first transforming the data into RDF, we want to provide a new SPARQL extension that offers this behaviour. We have decided to build this extension starting from two existing ones, C-SPARQL and CQELS, and to add features for managing streams given in CSV format. We natively support CSV because it is the most common format, it is widely supported by consumer, business, and scientific applications, and it can easily be obtained from any other serialization format.
Below we present the formal definition of our new SPARQL extension, which we name DubExtensions since it is based on both the C-SPARQL and CQELS specifications. The figure below shows all the features we have added to extend the SPARQL grammar: part of the grammar corresponds to C-SPARQL and part to CQELS; however, we further extend their features to support CSV graph patterns and CSV input streams.
The example below shows a possible use of a DubExtensions query that merges static RDF data with a CSV data stream. As one can see, the query provides the average bike usage for every station every 20 minutes, by selecting the static information about stations (“?station”) from a standard Basic Graph Pattern and by counting the number of bikes available (“?bikecount”) from the CSV data stream.
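Independently of the query syntax, the computation described above can be illustrated with a minimal Python sketch. It is not the DubExtensions implementation and does not use its grammar. The column names (timestamp, station_id, bikecount), the station identifiers and names, the sample rows, and the use of non-overlapping (tumbling) 20-minute windows are all assumptions made for illustration. The static station dictionary stands in for the bindings of “?station” that a Basic Graph Pattern would produce, and the average of the bikecount values per station and window is one possible reading of "average bike usage".

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed window size, taken from the "every 20 minutes" in the text.
WINDOW = timedelta(minutes=20)

# Hypothetical static data, standing in for the ?station bindings
# obtained from a Basic Graph Pattern over the RDF dataset.
STATIONS = {"s1": "Central Station", "s2": "Harbour"}

# Hypothetical CSV stream content; the column names are assumptions.
SAMPLE = """timestamp,station_id,bikecount
2015-06-01T10:00:00,s1,4
2015-06-01T10:10:00,s1,8
2015-06-01T10:05:00,s2,3
2015-06-01T10:25:00,s1,10
2015-06-01T10:30:00,s9,7
"""

ORIGIN = datetime(2015, 6, 1, 10, 0)


def window_start(ts, origin):
    """Return the start of the tumbling window that contains ts."""
    index = (ts - origin) // WINDOW
    return origin + index * WINDOW


def average_bikes(csv_text, stations, origin):
    """Average bikecount per (window start, station name).

    Rows whose station_id is unknown in the static data are dropped,
    which mimics a join between the stream and the static pattern.
    """
    sums = defaultdict(lambda: [0, 0])  # key -> [total, row count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["station_id"] not in stations:
            continue
        ts = datetime.fromisoformat(row["timestamp"])
        key = (window_start(ts, origin), stations[row["station_id"]])
        sums[key][0] += int(row["bikecount"])
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


if __name__ == "__main__":
    for (start, name), avg in sorted(average_bikes(SAMPLE, STATIONS, ORIGIN).items()):
        print(start.isoformat(), name, avg)
```

In this sketch the row for station s9 is discarded because it has no counterpart in the static data, and the reading at 10:25 falls into the second window. A real streaming engine would evaluate the windows continuously rather than over a finite string.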