Introduction

Graph and Cypher in BangDB are quite powerful and allow users to deal with modern and complex use cases. BangDB natively integrates Graph with Stream, which makes it possible to ingest streaming data and keep growing the Graph at the same time. With native AI integration, data science becomes a natural element of the Graph. With simple Cypher queries, users can do much more, and in real time, for several use cases.

Data in a BangDB graph table is defined as triples. A triple contains a subject, an object, and the relationship (predicate) between them. All data is stored as triples within the DB. BangDB arranges and maintains the stored triples so that a variety of queries can be written and run efficiently.

The structure of the query is very similar to “Cypher”. BangDB uses Cypher-like queries to process the data. The basic structures look like the following:

CREATE () -[]-> ()    – for creating a node or triple

S=>() -[]-> ()    – for querying data

<op USING attr1 SORT_DESC attr2 LIMIT n> query1 ++ query2    – operation on disjoint sets of queries

‘()’ denotes a subject or an object, and ‘[]’ denotes a relation (predicate), with ‘->’ defining the direction. The arrangement is always “subject predicate object”.

Every node has a label associated with it and is written as “label:name”.
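To make the triple model concrete, here is a minimal Python sketch of how “label:name” nodes and subject-predicate-object triples could be represented. It only illustrates the data model described above; it is not BangDB's storage format or API, and the names `Triple`, `parse_node`, and the sample data are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject -[predicate]-> object fact; nodes are 'label:name' strings."""
    subject: str
    predicate: str
    obj: str


def parse_node(node: str) -> tuple[str, str]:
    """Split a 'label:name' node into (label, name)."""
    label, _, name = node.partition(":")
    return label, name


# Hypothetical sample data, always in "subject predicate object" order.
triples = [
    Triple("Person:alice", "WORKS_AT", "Company:acme"),
    Triple("Person:bob", "WORKS_AT", "Company:acme"),
    Triple("Person:alice", "KNOWS", "Person:bob"),
]

# Names of every Person subject that has a WORKS_AT relation.
employees = [
    parse_node(t.subject)[1]
    for t in triples
    if t.predicate == "WORKS_AT" and parse_node(t.subject)[0] == "Person"
]
print(employees)
```

Keeping the label inside the node string mirrors the “label:name” convention, so a query can filter on either the label or the name.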

The following keywords are available for queries.

Node, entity creation

CREATE – to create a single node, or triple

Running query and selecting data

S=>                  – namespace for the unit of query

RETURN          – selecting attributes for any query

WHERE            – conditions for the query

AS                     – selecting columns/attributes with alias

DATAQUERY   – for filtering within node and relations for properties

SORT_DESC    – for sorting in descending order

SORT_ASC       – for sorting in ascending order

LIMIT               – for limiting number of selections

Statistics

COUNT     – counting all using COUNT(*) or COUNT(A.col)

UCOUNT  – unique counting

AVG          – average of any attribute

MIN          – min value

MAX         – max value

STD          – standard deviation

SUM         – sum

EXKURT    – ex-kurtosis

SKEW       – skewness
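For readers less familiar with the last few statistics, the sketch below computes them in plain Python. It assumes population (not sample) moments, which the text above does not specify for BangDB; the function name `describe` and the sample values are hypothetical.

```python
import math


def describe(values):
    """Compute the listed statistics using population moments (an assumption)."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "COUNT": n,
        "UCOUNT": len(set(values)),   # number of distinct values
        "AVG": mean,
        "MIN": min(values),
        "MAX": max(values),
        "STD": math.sqrt(m2),
        "SUM": sum(values),
        "EXKURT": m4 / m2 ** 2 - 3.0,  # excess kurtosis: 0 for a normal distribution
        "SKEW": m3 / m2 ** 1.5,        # 0 for a symmetric distribution
    }


stats = describe([2, 4, 4, 6])
print(stats)
```

The input must contain at least two distinct values, otherwise the second moment is zero and skewness and excess kurtosis are undefined.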

Functional properties

SYMM        – symmetric relations

ASYMM     – asymmetric relations

Graph algos

ALL_PATH        – all paths between any two given nodes

SHORT_PATH  – shortest path between any two given nodes
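The difference between the two path keywords can be sketched in Python as follows. This is a generic illustration, not BangDB's implementation: it assumes an unweighted directed graph and measures "shortest" by hop count, which the text above does not state. The graph and function names are hypothetical.

```python
from collections import deque

# Hypothetical directed graph as an adjacency list.
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": ["e"],
    "e": [],
}


def all_paths(graph, start, goal, path=None):
    """Enumerate every simple path from start to goal (depth-first)."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # skip nodes already on this path to avoid cycles
            found.extend(all_paths(graph, nxt, goal, path))
    return found


def shortest_path(graph, start, goal):
    """Return one path with the fewest hops (breadth-first), or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


paths = all_paths(graph, "a", "e")
best = shortest_path(graph, "a", "e")
print(paths)
print(best)
```

Enumerating all paths can grow exponentially with graph size, while the breadth-first search visits each node at most once.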

Set operations

ADD             – adding two or more sets ( UNION )

CROSS         – cross product of two sets ( INTERSECT )

SUBTRACT  – difference of two sets    ( DIFFERENCE )

PIPE             – for piping (or sending) the first list to the second query
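The set operations above map naturally onto Python sets. The sketch below follows the mapping given in the list (ADD as union, CROSS as intersection, SUBTRACT as difference) and models PIPE as passing the first result list into a second query. All names and data are hypothetical stand-ins, not BangDB calls.

```python
# Hypothetical result sets from two independent queries.
query1 = {"Person:alice", "Person:bob", "Person:carol"}
query2 = {"Person:bob", "Person:carol", "Person:dave"}

added = query1 | query2        # ADD      -> union
crossed = query1 & query2      # CROSS    -> intersection, per the list above
subtracted = query1 - query2   # SUBTRACT -> difference


def names_starting_with_c(rows):
    """Stand-in for a second query: keep nodes whose name starts with 'c'."""
    return [r for r in rows if r.split(":")[1].startswith("c")]


def pipe(first, second_query):
    """PIPE: send the first result list to the second query."""
    return second_query(first)


piped = pipe(sorted(query1), names_starting_with_c)
print(added, crossed, subtracted, piped)
```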

Data Science

SIMILARITY                           – compute similarities among set of nodes based on various data

CLUSTER                               – to find natural clusters

CENTRALITY                         – finding the node centrality

COMMUNITY_DETECTION – for detecting several communities within graph

GROUPS                               – finding several groups given properties

ML_ALGO                             – brings ML algorithms to the Graph; the model name is supplied as well

Deep Learning*                  – DNN, RNN, ResNet. Embeddable within graph

Information Extraction*   – Ontologies or triple generation through IE

Data is processed from left to right. Several triples can be chained to form a query, like:

S1=>() -[]-> () -[]-> () …

In the example above, the first triple produces an intermediate set of results, and that intermediate output becomes the input for the next triple, and so on. Evaluation therefore proceeds from left to right using the intermediate results: the subject of each subsequent triple in the chain is the intermediate result of the previous triple.
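The left-to-right evaluation of a chained query can be sketched in Python. This models only the semantics described above, where the objects found by one triple become the subjects of the next; it is not how BangDB executes queries, and `TRIPLES`, `step`, and `chain` are hypothetical names.

```python
# Hypothetical triples: (subject, predicate, object).
TRIPLES = [
    ("Person:alice", "FOLLOWS", "Person:bob"),
    ("Person:alice", "FOLLOWS", "Person:carol"),
    ("Person:bob", "LIKES", "Product:laptop"),
    ("Person:carol", "LIKES", "Product:phone"),
    ("Person:dave", "LIKES", "Product:tablet"),
]


def step(subjects, predicate):
    """Evaluate one triple: objects reachable from subjects via predicate."""
    return {o for s, p, o in TRIPLES if s in subjects and p == predicate}


def chain(start_subjects, predicates):
    """Evaluate a chained query left to right."""
    current = set(start_subjects)
    for predicate in predicates:
        # The intermediate result becomes the subject set of the next triple.
        current = step(current, predicate)
    return current


# Roughly: (alice) -[FOLLOWS]-> () -[LIKES]-> ()
result = chain({"Person:alice"}, ["FOLLOWS", "LIKES"])
print(result)
```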

In some cases, we would like to keep the subject of the first triple as the subject of the subsequent triple; for this we can use a structure like the following. This is in contrast with the chained query, where the object of the first triple becomes the subject of the second one, and so on.

S2=>[S1=>() -[]-> ()] -[]-> () …
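The contrast between the nested form and a plain chain can be sketched in Python. This reflects one reading of the description above, namely that the subjects matched by the inner query S1 are reused as the subjects of the outer triple; the data and helper names are hypothetical, and this is not BangDB code.

```python
# Hypothetical triples: (subject, predicate, object).
TRIPLES = [
    ("Person:alice", "FOLLOWS", "Person:bob"),
    ("Person:alice", "LIKES", "Product:phone"),
    ("Person:bob", "LIKES", "Product:laptop"),
    ("Person:carol", "LIKES", "Product:tablet"),
]


def subjects_of(predicate):
    """Subjects that have at least one triple with this predicate."""
    return {s for s, p, _ in TRIPLES if p == predicate}


def objects_of(subjects, predicate):
    """Objects reachable from the given subjects via this predicate."""
    return {o for s, p, o in TRIPLES if s in subjects and p == predicate}


# Inner query S1: () -[FOLLOWS]-> ()
s1_subjects = subjects_of("FOLLOWS")
s1_objects = objects_of(s1_subjects, "FOLLOWS")

# Nested form: the subject of S1 stays the subject of the next triple.
nested = objects_of(s1_subjects, "LIKES")

# Chained form: the object of S1 becomes the subject of the next triple.
chained = objects_of(s1_objects, "LIKES")

print(nested, chained)
```

With this data the nested form answers "what do people who follow someone like?", while the chained form answers "what do the followed people like?".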

We will see examples of these in subsequent sections.

We will use the BangDB CLI to perform these exercises. But before we go there, let’s see how BangDB Cypher is different from the original Cypher.

Check out a sample use case here to learn a bit more about Graph and Cypher in BangDB.

Check out the graph document here.
