PolarDB's cloud-native architecture separates compute from storage. This provides low-cost data storage, efficient scaling, fast parallel computing across multiple machines, and fast data search and processing. Combining PolarDB with computing algorithms lets you extract value from business data and turn that data into productivity.
This article describes how to combine PolarDB with MADlib to add machine learning capabilities to PolarDB.
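MADlib is a large, comprehensive in-database machine learning library. As the type listing later in this article shows, it covers regression, classification, clustering, topic modeling, association rules, time series analysis, and statistical tests, among other things.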
In this example, MADlib is deployed in the PolarDB container.
Enter the PolarDB container:
docker exec -it 67e1eed1b4b6 bash
Download the MADlib RPM package. The installation guide is available at:
https://cwiki.apache.org/confluence/display/MADLIB/Installation+Guide
wget https://dist.apache.org/repos/dist/release/madlib/1.20.0/apache-madlib-1.20.0-CentOS7.rpm
Install MADlib:
sudo rpm -ivh apache-madlib-1.20.0-CentOS7.rpm
Load MADlib into the target database of the PolarDB instance. MADlib is loaded by the madpack tool into its own schema and is not managed as a PostgreSQL extension:
/usr/local/madlib/bin/madpack -s madlib -p postgres -c [user[/password]@][host][:port][/database] install
or
/usr/local/madlib/bin/madpack -s madlib -p postgres install
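The value passed to -c follows the pattern [user[/password]@][host][:port][/database], in which every part is optional. The following is a minimal sketch of how such a string could be assembled. The helper name madpack_conn and the sample values (user, host, port 5432, database) are hypothetical, and the lowercase install path and schema name assume the default layout of the MADlib RPM. The script only prints the resulting command and does not run madpack.

```shell
#!/bin/sh
# Hypothetical helper: build the -c value from its optional parts.
# Pattern: [user[/password]@][host][:port][/database]
# The function body runs in a subshell so its variables do not leak.
madpack_conn() (
  user="$1"; password="$2"; host="$3"; port="$4"; database="$5"
  conn=""
  if [ -n "$user" ]; then
    conn="$user"
    if [ -n "$password" ]; then conn="$conn/$password"; fi
    conn="$conn@"
  fi
  conn="$conn$host"
  if [ -n "$port" ]; then conn="$conn:$port"; fi
  if [ -n "$database" ]; then conn="$conn/$database"; fi
  printf '%s\n' "$conn"
)

# Sample values are assumptions; replace them with your own.
conn="$(madpack_conn postgres "" 127.0.0.1 5432 postgres)"

# Print the command instead of executing it.
echo "/usr/local/madlib/bin/madpack -s madlib -p postgres -c $conn install"
```

Leaving an argument empty drops that part from the string, so the same helper covers both the full form and the short forms.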
Verify the MADlib installation:
/usr/local/madlib/bin/madpack -s madlib -p postgres install-check
Use MADlib:
[postgres@67e1eed1b4b6 ~]$ psql -h 127.0.0.1
psql (11.9)
Type "help" for help.
postgres=# set search_path =MADlib, "$user", public;
SET
postgres=# \dT+
List of data types
Schema | Name | Internal name | Size | Elements | Owner | Access privileges | Description
--------+-----------------------------------+-----------------------------------+-------+----------+----------+-------------------+-------------
MADlib | args_and_value_double | args_and_value_double | tuple | | postgres | |
MADlib | __arima_lm_result | __arima_lm_result | tuple | | postgres | |
MADlib | __arima_lm_stat_result | __arima_lm_stat_result | tuple | | postgres | |
MADlib | __arima_lm_sum_result | __arima_lm_sum_result | tuple | | postgres | |
MADlib | assoc_rules_results | assoc_rules_results | tuple | | postgres | |
MADlib | bytea8 | bytea8 | var | | postgres | |
MADlib | _cat_levels_type | _cat_levels_type | tuple | | postgres | |
MADlib | chi2_test_result | chi2_test_result | tuple | | postgres | |
MADlib | closest_column_result | closest_column_result | tuple | | postgres | |
MADlib | closest_columns_result | closest_columns_result | tuple | | postgres | |
MADlib | __clustered_agg_result | __clustered_agg_result | tuple | | postgres | |
MADlib | __clustered_lin_result | __clustered_lin_result | tuple | | postgres | |
MADlib | __clustered_log_result | __clustered_log_result | tuple | | postgres | |
MADlib | __clustered_mlog_result | __clustered_mlog_result | tuple | | postgres | |
MADlib | complex | complex | tuple | | postgres | |
MADlib | __coxph_a_b_result | __coxph_a_b_result | tuple | | postgres | |
MADlib | __coxph_cl_var_result | __coxph_cl_var_result | tuple | | postgres | |
MADlib | coxph_result | coxph_result | tuple | | postgres | |
MADlib | coxph_step_result | coxph_step_result | tuple | | postgres | |
MADlib | cox_prop_hazards_result | cox_prop_hazards_result | tuple | | postgres | |
MADlib | __cox_resid_stat_result | __cox_resid_stat_result | tuple | | postgres | |
MADlib | __dbscan_edge | __dbscan_edge | tuple | | postgres | |
MADlib | __dbscan_losses | __dbscan_losses | tuple | | postgres | |
MADlib | __dbscan_record | __dbscan_record | tuple | | postgres | |
MADlib | dense_linear_solver_result | dense_linear_solver_result | tuple | | postgres | |
MADlib | __elastic_net_result | __elastic_net_result | tuple | | postgres | |
MADlib | _flattened_tree | _flattened_tree | tuple | | postgres | |
MADlib | f_test_result | f_test_result | tuple | | postgres | |
MADlib | __glm_result_type | __glm_result_type | tuple | | postgres | |
MADlib | _grp_state_type | _grp_state_type | tuple | | postgres | |
MADlib | heteroskedasticity_test_result | heteroskedasticity_test_result | tuple | | postgres | |
MADlib | kmeans_result | kmeans_result | tuple | | postgres | |
MADlib | kmeans_state | kmeans_state | tuple | | postgres | |
MADlib | ks_test_result | ks_test_result | tuple | | postgres | |
MADlib | lda_result | lda_result | tuple | | postgres | |
MADlib | lincrf_result | lincrf_result | tuple | | postgres | |
MADlib | linear_svm_result | linear_svm_result | tuple | | postgres | |
MADlib | linregr_result | linregr_result | tuple | | postgres | |
MADlib | lmf_result | lmf_result | tuple | | postgres | |
MADlib | __logregr_result | __logregr_result | tuple | | postgres | |
MADlib | marginal_logregr_result | marginal_logregr_result | tuple | | postgres | |
MADlib | marginal_mlogregr_result | marginal_mlogregr_result | tuple | | postgres | |
MADlib | margins_result | margins_result | tuple | | postgres | |
MADlib | matrix_result | matrix_result | tuple | | postgres | |
MADlib | __mlogregr_cat_coef | __mlogregr_cat_coef | tuple | | postgres | |
MADlib | mlogregr_result | mlogregr_result | tuple | | postgres | |
MADlib | mlogregr_summary_result | mlogregr_summary_result | tuple | | postgres | |
MADlib | mlp_result | mlp_result | tuple | | postgres | |
MADlib | __multinom_result_type | __multinom_result_type | tuple | | postgres | |
MADlib | mw_test_result | mw_test_result | tuple | | postgres | |
MADlib | one_way_anova_result | one_way_anova_result | tuple | | postgres | |
MADlib | __ordinal_result_type | __ordinal_result_type | tuple | | postgres | |
MADlib | path_match_result | path_match_result | tuple | | postgres | |
MADlib | _pivotalr_lda_model | _pivotalr_lda_model | tuple | | postgres | |
MADlib | _prune_result_type | _prune_result_type | tuple | | postgres | |
MADlib | __rb_coxph_hs_result | __rb_coxph_hs_result | tuple | | postgres | |
MADlib | __rb_coxph_result | __rb_coxph_result | tuple | | postgres | |
MADlib | residual_norm_result | residual_norm_result | tuple | | postgres | |
MADlib | robust_linregr_result | robust_linregr_result | tuple | | postgres | |
MADlib | robust_logregr_result | robust_logregr_result | tuple | | postgres | |
MADlib | robust_mlogregr_result | robust_mlogregr_result | tuple | | postgres | |
MADlib | sparse_linear_solver_result | sparse_linear_solver_result | tuple | | postgres | |
MADlib | summary_result | summary_result | tuple | | postgres | |
MADlib | __svd_bidiagonal_matrix_result | __svd_bidiagonal_matrix_result | tuple | | postgres | |
MADlib | __svd_lanczos_result | __svd_lanczos_result | tuple | | postgres | |
MADlib | __svd_vec_mat_mult_result | __svd_vec_mat_mult_result | tuple | | postgres | |
MADlib | svec | svec | var | | postgres | |
MADlib | _tree_result_type | _tree_result_type | tuple | | postgres | |
MADlib | t_test_result | t_test_result | tuple | | postgres | |
MADlib | __utils_scales | __utils_scales | tuple | | postgres | |
MADlib | wsr_test_result | wsr_test_result | tuple | | postgres | |
MADlib | xgb_gridsearch_train_results_type | xgb_gridsearch_train_results_type | tuple | | postgres | |
public | vector | vector | var | | postgres | |
(73 rows)
postgres=# \do+
List of operators
Schema | Name | Left arg type | Right arg type | Result type | Function | Description
--------+------+--------------------+--------------------+------------------+-------------------------------+-------------
MADlib | %*% | double precision[] | double precision[] | double precision | MADlib.svec_dot |
MADlib | %*% | double precision[] | svec | double precision | MADlib.svec_dot |
MADlib | %*% | svec | double precision[] | double precision | MADlib.svec_dot |
MADlib | %*% | svec | svec | double precision | MADlib.svec_dot |
MADlib | * | double precision[] | double precision[] | svec | float8arr_mult_float8arr |
MADlib | * | double precision[] | svec | svec | float8arr_mult_svec |
MADlib | * | svec | double precision[] | svec | svec_mult_float8arr |
MADlib | * | svec | svec | svec | svec_mult |
MADlib | *|| | integer | svec | svec | svec_concat_replicate |
MADlib | + | double precision[] | double precision[] | svec | float8arr_plus_float8arr |
MADlib | + | double precision[] | svec | svec | float8arr_plus_svec |
MADlib | + | svec | double precision[] | svec | svec_plus_float8arr |
MADlib | + | svec | svec | svec | svec_plus |
MADlib | - | double precision[] | double precision[] | svec | float8arr_minus_float8arr |
MADlib | - | double precision[] | svec | svec | float8arr_minus_svec |
MADlib | - | svec | double precision[] | svec | svec_minus_float8arr |
MADlib | - | svec | svec | svec | svec_minus |
MADlib | / | double precision[] | double precision[] | svec | float8arr_div_float8arr |
MADlib | / | double precision[] | svec | svec | float8arr_div_svec |
MADlib | / | svec | double precision[] | svec | svec_div_float8arr |
MADlib | / | svec | svec | svec | svec_div |
MADlib | < | svec | svec | boolean | svec_lt |
MADlib | <= | svec | svec | boolean | svec_le |
MADlib | <> | svec | svec | boolean | svec_ne |
MADlib | = | svec | svec | boolean | svec_eq |
MADlib | == | svec | svec | boolean | svec_eq |
MADlib | > | svec | svec | boolean | svec_gt |
MADlib | >= | svec | svec | boolean | svec_ge |
MADlib | ^ | svec | svec | svec | svec_pow |
MADlib | || | svec | svec | svec | svec_concat |
public | + | vector | vector | vector | vector_add |
public | - | vector | vector | vector | vector_sub |
public | < | vector | vector | boolean | vector_lt |
public | <#> | vector | vector | double precision | vector_negative_inner_product |
public | <-> | vector | vector | double precision | l2_distance |
public | <= | vector | vector | boolean | vector_le |
public | <=> | vector | vector | double precision | cosine_distance |
public | <> | vector | vector | boolean | vector_ne |
public | = | vector | vector | boolean | vector_eq |
public | > | vector | vector | boolean | vector_gt |
public | >= | vector | vector | boolean | vector_ge |
(41 rows)
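The listings above only show which types and operators MADlib registered. As a rough sketch of what calling the library could look like, the script below writes a short SQL session to a file. It checks the version, calls svec_dot (the function behind the %*% operator in the listing), and trains a linear regression model with linregr_train. The table names, column names, and data values are hypothetical, and the schema name is written in lowercase on the assumption that the schema was created with the default name. The script runs psql only when RUN_DEMO=1 is set, so it is safe to execute on a machine without PolarDB.

```shell
#!/bin/sh
# Write a small MADlib demo session to a file.
# All table names and data values below are hypothetical.
sql_file="${TMPDIR:-/tmp}/madlib_demo.sql"

cat > "$sql_file" <<'SQL'
-- Confirm that the MADlib schema is reachable.
SELECT madlib.version();

-- Dot product of two dense arrays, via the function behind %*%.
SELECT madlib.svec_dot(ARRAY[1,2,3]::float8[], ARRAY[4,5,6]::float8[]);

-- Hypothetical training data: y is roughly 2*x + 1.
DROP TABLE IF EXISTS demo_points, demo_points_model, demo_points_model_summary;
CREATE TABLE demo_points (id int, x float8, y float8);
INSERT INTO demo_points VALUES
  (1, 1, 3.1), (2, 2, 4.9), (3, 3, 7.2), (4, 4, 8.8), (5, 5, 11.1);

-- Fit y ~ 1 + x; the model is written to demo_points_model.
SELECT madlib.linregr_train('demo_points', 'demo_points_model',
                            'y', 'ARRAY[1, x]');

-- Inspect the model, then apply it to the training rows.
SELECT coef, r2 FROM demo_points_model;
SELECT p.id, p.y,
       madlib.linregr_predict(m.coef, ARRAY[1, p.x]) AS predicted
FROM demo_points p, demo_points_model m
ORDER BY p.id;
SQL

# Execute only on request; otherwise just report where the file is.
if [ "${RUN_DEMO:-0}" = "1" ]; then
  psql -h 127.0.0.1 -f "$sql_file"
else
  echo "Wrote $sql_file; run it with: psql -h 127.0.0.1 -f $sql_file"
fi
```

The functions are schema-qualified, so the session does not depend on the search_path setting used earlier.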
References:
https://madlib.apache.org/docs/latest/index.html
https://cwiki.apache.org/confluence/display/MADLIB/Installation+Guide