BangDB ML Helper API (Client)
Client API
BangDB ML Helper offers several APIs that simplify ML-related activities. The type covers model training, prediction, model versioning, and deployment, as well as management of large files and binary objects related to ML.
Check out a few real-world examples to learn more, or try them out on BangDB
C++
Java
To create an mlhelper object
BangDBMLHelper(train_pred_brs_info *tpbinfo, const char *conf_path = NULL, bool isssl = true)
To create a bucket to store all intermediate training and testing files
int createBucket(const char *bucket_info)
bucket_info is the name of the bucket to be created.
It returns -1 for error
To create or change the name of the bucket
void setBucket(const char *bucket_info)
To upload the files required to train or predict
long uploadFile(const char *key, const char *fpath, InsertOptions iop)
key is the id of the file, and fpath is the path to the file including the file name.
InsertOptions is an enum with values:
INSERT_UNIQUE, //if non-existing then insert else return
UPDATE_EXISTING, //if existing then update else return
INSERT_UPDATE, //insert if non-existing else update
DELETE_EXISTING, //delete if existing
UPDATE_EXISTING_INPLACE, //only for inplace update
INSERT_UPDATE_INPLACE, //only for inplace update
Please see more on this at bangdb common
It returns -1 for error
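The InsertOptions comments describe what happens depending on whether the key already exists. The sketch below models those four basic rules on an in-memory map so the behaviour can be checked in isolation. `FakeBucket` and `apply_option` are hypothetical names used only for illustration; they are not part of the BangDB API, and the in-place variants are left out.

```cpp
#include <cassert>
#include <map>
#include <string>

// Enumerator names copied from the listing above; in-place variants omitted.
enum InsertOptions {
    INSERT_UNIQUE,    // if non-existing then insert else return
    UPDATE_EXISTING,  // if existing then update else return
    INSERT_UPDATE,    // insert if non-existing else update
    DELETE_EXISTING   // delete if existing
};

// Hypothetical in-memory stand-in for a bucket: key -> file contents.
using FakeBucket = std::map<std::string, std::string>;

// Applies one option to the fake bucket and reports whether the bucket changed.
bool apply_option(FakeBucket &bucket, const std::string &key,
                  const std::string &value, InsertOptions iop) {
    const bool exists = bucket.count(key) != 0;
    switch (iop) {
    case INSERT_UNIQUE:
        if (exists) return false;
        bucket[key] = value;
        return true;
    case UPDATE_EXISTING:
        if (!exists) return false;
        bucket[key] = value;
        return true;
    case INSERT_UPDATE:
        bucket[key] = value;
        return true;
    case DELETE_EXISTING:
        if (!exists) return false;
        bucket.erase(key);
        return true;
    }
    return false;
}
```

With this model, INSERT_UNIQUE on an existing key leaves the stored value untouched, while INSERT_UPDATE always overwrites it.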
To train a model, call the trainModel API. It returns immediately and, if successful, schedules training of the model. The user should then call getModelStatus() periodically until it returns an end status.
int trainModel(const char *req)
It takes a training request and returns the status of the training request.
It returns -1 for error
To get the status of the model after a training request is fired
char *getModelStatus(const char *req)
The req input parameter looks like the following:
req = {"schema-name":, "model_name": }
The return value looks like the following:
{"schema-name":, "model_name":, "train_start_ts":, "train_end_ts":, "train_state":,}
ML_BANGDB_TRAINING_STATE is an enum with values:
//error
ML_BANGDB_TRAINING_STATE_INVALID_INPUT = 10,
ML_BANGDB_TRAINING_STATE_NOT_PRSENT,
ML_BANGDB_TRAINING_STATE_ERROR_PARSE,
ML_BANGDB_TRAINING_STATE_ERROR_FORMAT,
ML_BANGDB_TRAINING_STATE_ERROR_BRS,
ML_BANGDB_TRAINING_STATE_ERROR_TUNE,
ML_BANGDB_TRAINING_STATE_ERROR_TRAIN,
ML_FILE_TYPE_ERROR_VAL_TESTDATA,
ML_FILE_TYPE_ERROR_VAL_TRAINDATA,
ML_BANGDB_TRAINING_STATE_LIMBO,
//intermediate states
ML_BANGDB_TRAINING_STATE_BRS_GET_PENDING,
ML_BANGDB_TRAINING_STATE_BRS_GET_DONE,
ML_BANGDB_TRAINING_STATE_REFORMAT_DONE,
ML_BANGDB_TRAINING_STATE_SCALE_TUNING_DONE,
ML_BANGDB_TRAINING_STATE_BRS_MODEL_UPLOAD_PENDING,
//training done
ML_BANGDB_TRAINING_STATE_TRAINING_DONE, //25
ML_BANGDB_TRAINING_STATE_DEPRICATED
The above is true for ML related model status.
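The ML training states fall into the groups marked by the //error, //intermediate states and //training done comments. As a minimal sketch, the helper below maps a numeric state to one of those groups. The enumerators are copied from the listing; `TrainPhase` and `classify_ml_state` are hypothetical names introduced here, and treating LIMBO as an error simply follows its position under the //error comment.

```cpp
#include <cassert>

// Enumerators and ordering copied from the ML_BANGDB_TRAINING_STATE listing.
enum ML_BANGDB_TRAINING_STATE {
    // error
    ML_BANGDB_TRAINING_STATE_INVALID_INPUT = 10,
    ML_BANGDB_TRAINING_STATE_NOT_PRSENT,
    ML_BANGDB_TRAINING_STATE_ERROR_PARSE,
    ML_BANGDB_TRAINING_STATE_ERROR_FORMAT,
    ML_BANGDB_TRAINING_STATE_ERROR_BRS,
    ML_BANGDB_TRAINING_STATE_ERROR_TUNE,
    ML_BANGDB_TRAINING_STATE_ERROR_TRAIN,
    ML_FILE_TYPE_ERROR_VAL_TESTDATA,
    ML_FILE_TYPE_ERROR_VAL_TRAINDATA,
    ML_BANGDB_TRAINING_STATE_LIMBO,
    // intermediate states
    ML_BANGDB_TRAINING_STATE_BRS_GET_PENDING,
    ML_BANGDB_TRAINING_STATE_BRS_GET_DONE,
    ML_BANGDB_TRAINING_STATE_REFORMAT_DONE,
    ML_BANGDB_TRAINING_STATE_SCALE_TUNING_DONE,
    ML_BANGDB_TRAINING_STATE_BRS_MODEL_UPLOAD_PENDING,
    // training done
    ML_BANGDB_TRAINING_STATE_TRAINING_DONE,  // 25
    ML_BANGDB_TRAINING_STATE_DEPRICATED
};

// Hypothetical grouping used only in this sketch.
enum class TrainPhase { Error, InProgress, Done, Deprecated, Unknown };

// Maps a numeric train_state to the group it belongs to in the listing.
TrainPhase classify_ml_state(int state) {
    if (state >= ML_BANGDB_TRAINING_STATE_INVALID_INPUT &&
        state <= ML_BANGDB_TRAINING_STATE_LIMBO)
        return TrainPhase::Error;
    if (state >= ML_BANGDB_TRAINING_STATE_BRS_GET_PENDING &&
        state <= ML_BANGDB_TRAINING_STATE_BRS_MODEL_UPLOAD_PENDING)
        return TrainPhase::InProgress;
    if (state == ML_BANGDB_TRAINING_STATE_TRAINING_DONE)
        return TrainPhase::Done;
    if (state == ML_BANGDB_TRAINING_STATE_DEPRICATED)
        return TrainPhase::Deprecated;
    return TrainPhase::Unknown;
}
```

A caller polling for completion would keep waiting only while the result is `TrainPhase::InProgress`.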
For IE (Information Extraction) related model status, use the following:
IE_BANGDB_TRAINING_STATE is an enum with values:
//error
IE_BANGDB_TRAINING_STATE_INVALID_INPUT = 10,
IE_BANGDB_TRAINING_STATE_NOT_PRSENT,
IE_BANGDB_TRAINING_STATE_ERROR_BRS,
IE_BANGDB_TRAINING_STATE_ERROR_HELPER_FILES,
IE_BANGDB_TRAINING_STATE_ERROR_BRS_FEATURE_EX,
IE_BANGDB_TRAINING_STATE_ERROR_BRS_HELP_FILES,
IE_BANGDB_TRAINING_STATE_ERROR_PRE_NER_TRAIN,
IE_BANGDB_TRAINING_STATE_LIMBO,
IE_BANGDB_TRAINING_STATE_ERROR_NER_TRAIN,
IE_BANGDB_TRAINING_STATE_ERROR_NER_TRAIN_BRS,
IE_BANGDB_TRAINING_STATE_ERROR_PRE_REL_TRAIN, //20
IE_BANGDB_TRAINING_STATE_ERROR_REL_TRAIN,
IE_BANGDB_TRAINING_STATE_ERROR_REL_TRAIN_BRS,
IE_BANGDB_TRAINING_STATE_ERROR_REL_LIST_BRS,
IE_FILE_TYPE_ERROR_VAL_TRAINDATA,
IE_FILE_TYPE_ERROR_VAL_TESTDATA,
IE_FILE_TYPE_ERROR_VAL_CLASSDATA,
IE_FILE_TYPE_ERROR_VAL_TOTALEXDATA,
//intermediate states
IE_BANGDB_TRAINING_STATE_BRS_GET_PENDING,
IE_BANGDB_TRAINING_STATE_BRS_GET_DONE,
IE_BANGDB_TRAINING_STATE_HELPER_DONE, //30
IE_BANGDB_TRAINING_STATE_PRE_NER_DONE,
IE_BANGDB_TRAINING_STATE_NER_DONE,
IE_BANGDB_TRAINING_STATE_PRE_REL_DONE,
IE_BANGDB_TRAINING_STATE_REL_DONE,
IE_BANGDB_TRAINING_STATE_BRS_MODEL_UPLOAD_PENDING,
IE_BANGDB_TRAINING_STATE_BRS_RELLIST_UPLOAD_PENDING,
//training done
IE_BANGDB_TRAINING_HELP_DONE, //37
IE_BANGDB_TRAINING_STATE_TRAINING_DONE, //38
IE_BANGDB_TRAINING_STATE_DEPRICATED
Please see more on this at bangdb common
It returns NULL for error or errcode as -1, else errcode for success
User should free the memory using delete[]
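Because trainModel only schedules training, a caller typically polls the status until it leaves the intermediate range. The sketch below shows one way to do that without depending on the BangDB headers: `fetch_status` is a caller-supplied function standing in for a call to getModelStatus, and `extract_train_state` is a hypothetical helper that assumes train_state appears in the returned string as a bare integer. The range 20 to 24 comes from counting the ML intermediate states in the enum listing above.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <functional>
#include <string>
#include <thread>

// Pulls the integer that follows "train_state": out of a status string.
// Returns -1 if the key or a number is missing (assumes a bare integer value).
int extract_train_state(const std::string &status) {
    const std::string key = "\"train_state\"";
    std::string::size_type pos = status.find(key);
    if (pos == std::string::npos) return -1;
    pos = status.find(':', pos + key.size());
    if (pos == std::string::npos) return -1;
    const char *start = status.c_str() + pos + 1;
    char *end = nullptr;
    const long value = std::strtol(start, &end, 10);
    if (end == start) return -1;  // no digits after the colon
    return static_cast<int>(value);
}

// Calls fetch_status up to max_attempts times, waiting between calls.
// Stops as soon as the state is outside the intermediate range 20..24,
// which also covers a failed extraction (-1). Returns the last state seen.
int poll_until_terminal(const std::function<std::string()> &fetch_status,
                        int max_attempts, std::chrono::milliseconds wait) {
    int state = -1;
    for (int i = 0; i < max_attempts; ++i) {
        state = extract_train_state(fetch_status());
        if (state < 20 || state > 24) return state;
        std::this_thread::sleep_for(wait);
    }
    return state;  // still intermediate after max_attempts
}
```

In real use, the function passed as `fetch_status` would call getModelStatus, copy the result into a std::string, and free the returned buffer with delete[].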
To delete the model
int delModel(const char *req)
This deletes the model identified by the req parameter. req = {"schema_name":, "model_name":}
It returns -1 for error
To delete a training request
int delTrainRequest(const char *req)
This deletes the training request. It is helpful when training got stuck for some reason and the status was not updated properly.
It returns -1 for error
To predict for a particular data or event
char *predict(const char *req)
Here is how req looks:
{schema-name, attr_type: NUM, data_type:event, re_format:N, model_name: model_name, data:"1 1:1.2 2:3.2 3:1.1"}
Here, attr_type is an enum with the following values:
ML_BANGDB_ATTR_TYPE_INVALID = 0,
ML_BANGDB_ATTR_TYPE_NUM,
ML_BANGDB_ATTR_TYPE_STR,
ML_BANGDB_ATTR_TYPE_HYBRID,
data_type is an enum with the following values:
ML_PREDICT_DATA_TYPE_INVALID = 0,
ML_PREDICT_DATA_TYPE_FILE,
ML_PREDICT_DATA_TYPE_EVENT
re_format is also an enum with the following values:
ML_BANGDB_ML_DATA_FORMAT_LIBSVM = 0,
ML_BANGDB_ML_DATA_FORMAT_CSV,
ML_BANGDB_ML_DATA_FORMAT_ARFF,
ML_BANGDB_ML_DATA_FORMAT_JSON,
ML_BANGDB_ML_DATA_FORMAT_INVALID
Please see more on this at bangdb common
It returns NULL for error or errcode as -1 else errcode.
User should free the memory using delete[]
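The sample data value in the predict request, "1 1:1.2 2:3.2 3:1.1", appears to follow the LIBSVM text layout: a label followed by 1-based index:value pairs. As a small sketch under that assumption, the hypothetical helper `to_libsvm` below builds such a string from a dense feature vector. It writes every feature, including zeros, which LIBSVM files usually omit.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Formats a label and a dense feature vector as "label 1:v1 2:v2 ...".
// Indices start at 1; values use the default stream formatting.
std::string to_libsvm(int label, const std::vector<double> &features) {
    std::ostringstream out;
    out << label;
    for (std::size_t i = 0; i < features.size(); ++i) {
        out << ' ' << (i + 1) << ':' << features[i];
    }
    return out.str();
}
```

Calling `to_libsvm(1, {1.2, 3.2, 1.1})` reproduces the data string used in the request shown above.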
To get the training requests of all models for a particular schema
ResultSet *getTrainingRequests(const char *schema)
It returns NULL for error.
To get the training request for a particular model
char *getRequest(const char *req)
req = {"schema_name":, "model_name": }
It returns NULL for error or errcode as -1 else errcode.
User should free the memory using delete[]
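Several calls on this page take a small request naming a schema and a model. The sketch below builds such a string while escaping quotes and backslashes in the values. It assumes the values are JSON strings and uses the "schema_name" spelling shown for getRequest; some requests on this page spell the key "schema-name", so check the call being made. `json_escape` and `make_model_req` are hypothetical helper names.

```cpp
#include <cassert>
#include <string>

// Escapes double quotes and backslashes only; control characters are not handled.
std::string json_escape(const std::string &in) {
    std::string out;
    for (char c : in) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Builds {"schema_name":"...","model_name":"..."} from the two names.
std::string make_model_req(const std::string &schema, const std::string &model) {
    return "{\"schema_name\":\"" + json_escape(schema) +
           "\",\"model_name\":\"" + json_escape(model) + "\"}";
}
```

The resulting std::string can be passed to a `const char *req` parameter through `c_str()`.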
To set the status for a particular training request
int setModelStatus(const char *status)
status = {"schema_name":, "model_name":, "status": }
It returns -1 for error
To get prediction status
char *getModelPredStatus(const char *req)
req = {"schema-name":, "model_name": }
It returns NULL for error or errcode as -1 else errcode.
User should free the memory using delete[]
To delete a prediction request
int delPredRequest(const char *req)
req = {"schema-name":, "model_name":, "file_name":}
It returns 0 for success and -1 for error
To upload any ML related file
long uploadFile(const char *bucket_info, const char *key, const char *fpath, InsertOptions iop)
key is the id for the file, and fpath is the path to the file including the file name.
To download a file from a given bucket
long downloadFile(const char *bucket_info, const char *key, const char *fname, const char *fpath)
It returns -1 for error
To get a binary object from a given bucket
long getObject(const char *bucket_info, const char *key, const char **data, long *datlen)
It gets the object (binary or otherwise) from the given bucket and key.
It fills data with the object and sets the datlen as length or size of the object.
It returns -1 for error
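getObject reports its result through two out-parameters, data and datlen. The sketch below shows the calling pattern against a hypothetical stand-in, `fake_get_object`, that has the same parameter shape. The stand-in's return value of 0 on success and its static payload are assumptions made for the example, and who owns the memory behind `*data` in the real API is not stated here. The point of the pattern is to use datlen rather than strlen, since a binary object may contain zero bytes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in with the same parameter shape as getObject.
// Returns -1 on bad arguments, 0 otherwise (the success value is an assumption).
long fake_get_object(const char *bucket_info, const char *key,
                     const char **data, long *datlen) {
    static const char payload[] = {'\x01', '\x00', '\x02', '\x03'};
    if (bucket_info == nullptr || key == nullptr ||
        data == nullptr || datlen == nullptr)
        return -1;
    *data = payload;
    *datlen = static_cast<long>(sizeof(payload));
    return 0;
}

// Copies the object into a vector using datlen, so embedded zero bytes survive.
bool fetch_object(const char *bucket, const char *key, std::vector<char> &out) {
    const char *data = nullptr;
    long datlen = 0;
    if (fake_get_object(bucket, key, &data, &datlen) < 0 ||
        data == nullptr || datlen < 0)
        return false;
    out.assign(data, data + datlen);
    return true;
}
```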
To delete a file from a bucket
int delFile(const char *bucket_info, const char *key)
It returns -1 for error
To delete a bucket
int delBucket(const char *bucket_info)
It returns -1 for error
To count the number of buckets
long countBuckets()
It returns -1 for error or count for success
To get the number of slices for a given file
int countSlices(const char *bucket_info, const char *key)
Since BRS (BangDB resource server) stores large files and objects in chunks, this function counts how many slices there are for the given file (key).
It returns -1 for error or the count for success
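When a file is stored in fixed-size chunks, the number of chunks is a ceiling division of the file size by the chunk size. The sketch below works through that arithmetic. Both the idea that BRS uses a single fixed slice size and the `slice_bytes` parameter are assumptions for illustration; the actual slice size is not given here, and `expected_slices` is a hypothetical helper, not a BangDB call.

```cpp
#include <cassert>

// Ceiling division: how many slices of slice_bytes are needed for file_bytes.
// Returns -1 for invalid input, mirroring the -1 error convention on this page.
long expected_slices(long file_bytes, long slice_bytes) {
    if (file_bytes < 0 || slice_bytes <= 0) return -1;
    return (file_bytes + slice_bytes - 1) / slice_bytes;
}
```

For example, a 10-byte object with a 4-byte slice size would need 3 slices, the last one only partly filled.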
To count the objects in a given bucket
long countObjects(const char *bucket_info)
It returns -1 for error
To get details of all the objects in a given bucket
char *countObjectsDetails(const char *bucket_info)
It returns NULL for error else the details.
User should free memory using delete[]
To count the number of models for a schema
long countModels(const char *schema)
It returns -1 for error else count
To get the list of objects for a given bucket
char *listObjects(const char *bucket_info, const char *key = NULL, int list_size_mb = 0)
This returns a JSON string with the list of objects in a given bucket, for a given key or for all keys.
It returns NULL for error else the object list.
User should free the memory of returned data using delete[]
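Many calls on this page return a `char *` that the caller must release with delete[]. One way to avoid leaks is to hand the pointer to a `std::unique_ptr<char[]>` immediately, as sketched below. `fake_list_objects` is a hypothetical stand-in that allocates with new[], and its payload is a placeholder, not the real output format. `take_string` is also a name introduced only for this example.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for a call that returns a new[]-allocated string or NULL.
char *fake_list_objects(bool fail) {
    if (fail) return nullptr;
    const char text[] = "{\"objects\":[]}";  // placeholder payload
    char *buf = new char[sizeof(text)];
    std::memcpy(buf, text, sizeof(text));    // copies the terminating '\0' too
    return buf;
}

// Takes ownership of raw, copies it into out, and releases it with delete[].
// Returns false (and leaves out unchanged) when raw is NULL.
bool take_string(char *raw, std::string &out) {
    std::unique_ptr<char[]> owner(raw);  // delete[] runs on every return path
    if (!owner) return false;
    out.assign(owner.get());
    return true;
}
```

The same wrapper applies to any of the calls above that document "free the memory using delete[]".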
To get the list of buckets present
char *listBuckets(const char *user_info)
This returns the list of all buckets for the user given by user_info, which looks like the following:
{"access_key":"akey", "secret_key":"skey"}
It returns NULL for error else the bucket list.
User should free the memory of returned data using delete[]
To get data from a stream to train a model
long uploadStreamDataForTrain(const char *req)
It returns -1 for error
To close the BangDB ML helper
void closeBangDBMLHelper()
To delete the mlhelper object
virtual ~BangDBMLHelper()