Configure MADlib for PolarDB to Realize the Database Machine Learning Function

This article describes how to combine PolarDB with MADlib to provide PolarDB with a machine learning feature.

Background

PolarDB's cloud-native architecture separates compute from storage. It offers low-cost data storage, efficient scaling, high-speed multi-machine parallel computing, and fast data search and processing. Combining PolarDB with computing algorithms helps extract value from business data and turn that data into productivity.

MADlib is a large, comprehensive in-database machine learning library. Its modules include:

  • Deep Learning
  • Graph
  • Model Selection
  • Sampling
  • Statistics
  • Supervised Learning
  • Time Series Analysis
  • Unsupervised Learning

Install MADlib on PolarDB to Provide the Machine Learning Feature

In this example, MADlib is deployed in the PolarDB container.

Enter the PolarDB container:

docker exec -it 67e1eed1b4b6 bash

Download the MADlib RPM package (see the installation guide):

https://cwiki.apache.org/confluence/display/MADLIB/Installation+Guide

wget https://dist.apache.org/repos/dist/release/madlib/1.20.0/apache-madlib-1.20.0-CentOS7.rpm

Install MADlib:

sudo rpm -ivh apache-madlib-1.20.0-CentOS7.rpm

Load MADlib into the target database in PolarDB. The madpack tool installs MADlib as a schema of objects; it is not managed as an extension:

/usr/local/madlib/bin/madpack -s madlib -p postgres -c [user[/password]@][host][:port][/database] install

or, to use the default connection settings:

/usr/local/madlib/bin/madpack -s madlib -p postgres install
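
The `-c` value follows the pattern `[user[/password]@][host][:port][/database]`, where every part is optional. As a minimal sketch of how that string is assembled, the hypothetical helper `madpack_conn` below (not part of MADlib) builds it from separate arguments; pass an empty string to skip a part.

```shell
# Build the value for madpack's -c option in the form
#   [user[/password]@][host][:port][/database]
# madpack_conn is a hypothetical helper for illustration, not a MADlib tool.
# Arguments: user password host port database (use "" to omit one).
madpack_conn() {
  local user="$1" password="$2" host="$3" port="$4" database="$5"
  local conn=""
  if [ -n "$user" ]; then
    conn="$user"
    # A password only makes sense together with a user.
    if [ -n "$password" ]; then
      conn="$conn/$password"
    fi
    conn="$conn@"
  fi
  conn="$conn$host"
  if [ -n "$port" ]; then
    conn="$conn:$port"
  fi
  if [ -n "$database" ]; then
    conn="$conn/$database"
  fi
  printf '%s\n' "$conn"
}
```

For example, `madpack_conn postgres "" 127.0.0.1 5432 postgres` prints `postgres@127.0.0.1:5432/postgres`, which can then be passed to madpack after `-c`.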

Verify the MADlib installation:

/usr/local/madlib/bin/madpack -s madlib -p postgres install-check
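
The install-check run prints one result per test, which is tedious to scan by eye. The sketch below counts the results read from standard input. It assumes each test is reported on a line shaped like `TEST CASE RESULT|Module: <name>|<file>|PASS|Time: <n> milliseconds`; that format is an assumption and should be checked against the actual output of your MADlib version. The function name `summarize_install_check` is hypothetical.

```shell
# Summarize madpack install-check output read from stdin.
# ASSUMPTION: each test is reported on a pipe-separated line such as
#   TEST CASE RESULT|Module: <name>|<file>|<PASS or FAIL>|Time: <n> milliseconds
# Any result other than PASS is counted as a failure.
# Exit status is non-zero if a test did not pass or no result lines were seen.
summarize_install_check() {
  awk -F'|' '
    $1 == "TEST CASE RESULT" {
      total++
      if ($4 == "PASS") {
        pass++
      } else {
        fail++
        print "not passed: " $2 " " $3
      }
    }
    END {
      printf "total=%d pass=%d fail=%d\n", total, pass, fail
      exit ((total == 0 || fail > 0) ? 1 : 0)
    }'
}
```

To use it, pipe the output of the install-check command into `summarize_install_check`. If the results turn out to be written to standard error, add `2>&1` before the pipe.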

Use MADlib:

[postgres@67e1eed1b4b6 ~]$ psql -h 127.0.0.1  
psql (11.9)  
Type "help" for help.  
  
postgres=# set search_path =MADlib, "$user", public;  
SET  
  
  
postgres=# \dT+  
                                                               List of data types  
 Schema |               Name                |           Internal name           | Size  | Elements |  Owner   | Access privileges | Description   
--------+-----------------------------------+-----------------------------------+-------+----------+----------+-------------------+-------------  
 MADlib | args_and_value_double             | args_and_value_double             | tuple |          | postgres |                   |   
 MADlib | __arima_lm_result                 | __arima_lm_result                 | tuple |          | postgres |                   |   
 MADlib | __arima_lm_stat_result            | __arima_lm_stat_result            | tuple |          | postgres |                   |   
 MADlib | __arima_lm_sum_result             | __arima_lm_sum_result             | tuple |          | postgres |                   |   
 MADlib | assoc_rules_results               | assoc_rules_results               | tuple |          | postgres |                   |   
 MADlib | bytea8                            | bytea8                            | var   |          | postgres |                   |   
 MADlib | _cat_levels_type                  | _cat_levels_type                  | tuple |          | postgres |                   |   
 MADlib | chi2_test_result                  | chi2_test_result                  | tuple |          | postgres |                   |   
 MADlib | closest_column_result             | closest_column_result             | tuple |          | postgres |                   |   
 MADlib | closest_columns_result            | closest_columns_result            | tuple |          | postgres |                   |   
 MADlib | __clustered_agg_result            | __clustered_agg_result            | tuple |          | postgres |                   |   
 MADlib | __clustered_lin_result            | __clustered_lin_result            | tuple |          | postgres |                   |   
 MADlib | __clustered_log_result            | __clustered_log_result            | tuple |          | postgres |                   |   
 MADlib | __clustered_mlog_result           | __clustered_mlog_result           | tuple |          | postgres |                   |   
 MADlib | complex                           | complex                           | tuple |          | postgres |                   |   
 MADlib | __coxph_a_b_result                | __coxph_a_b_result                | tuple |          | postgres |                   |   
 MADlib | __coxph_cl_var_result             | __coxph_cl_var_result             | tuple |          | postgres |                   |   
 MADlib | coxph_result                      | coxph_result                      | tuple |          | postgres |                   |   
 MADlib | coxph_step_result                 | coxph_step_result                 | tuple |          | postgres |                   |   
 MADlib | cox_prop_hazards_result           | cox_prop_hazards_result           | tuple |          | postgres |                   |   
 MADlib | __cox_resid_stat_result           | __cox_resid_stat_result           | tuple |          | postgres |                   |   
 MADlib | __dbscan_edge                     | __dbscan_edge                     | tuple |          | postgres |                   |   
 MADlib | __dbscan_losses                   | __dbscan_losses                   | tuple |          | postgres |                   |   
 MADlib | __dbscan_record                   | __dbscan_record                   | tuple |          | postgres |                   |   
 MADlib | dense_linear_solver_result        | dense_linear_solver_result        | tuple |          | postgres |                   |   
 MADlib | __elastic_net_result              | __elastic_net_result              | tuple |          | postgres |                   |   
 MADlib | _flattened_tree                   | _flattened_tree                   | tuple |          | postgres |                   |   
 MADlib | f_test_result                     | f_test_result                     | tuple |          | postgres |                   |   
 MADlib | __glm_result_type                 | __glm_result_type                 | tuple |          | postgres |                   |   
 MADlib | _grp_state_type                   | _grp_state_type                   | tuple |          | postgres |                   |   
 MADlib | heteroskedasticity_test_result    | heteroskedasticity_test_result    | tuple |          | postgres |                   |   
 MADlib | kmeans_result                     | kmeans_result                     | tuple |          | postgres |                   |   
 MADlib | kmeans_state                      | kmeans_state                      | tuple |          | postgres |                   |   
 MADlib | ks_test_result                    | ks_test_result                    | tuple |          | postgres |                   |   
 MADlib | lda_result                        | lda_result                        | tuple |          | postgres |                   |   
 MADlib | lincrf_result                     | lincrf_result                     | tuple |          | postgres |                   |   
 MADlib | linear_svm_result                 | linear_svm_result                 | tuple |          | postgres |                   |   
 MADlib | linregr_result                    | linregr_result                    | tuple |          | postgres |                   |   
 MADlib | lmf_result                        | lmf_result                        | tuple |          | postgres |                   |   
 MADlib | __logregr_result                  | __logregr_result                  | tuple |          | postgres |                   |   
 MADlib | marginal_logregr_result           | marginal_logregr_result           | tuple |          | postgres |                   |   
 MADlib | marginal_mlogregr_result          | marginal_mlogregr_result          | tuple |          | postgres |                   |   
 MADlib | margins_result                    | margins_result                    | tuple |          | postgres |                   |   
 MADlib | matrix_result                     | matrix_result                     | tuple |          | postgres |                   |   
 MADlib | __mlogregr_cat_coef               | __mlogregr_cat_coef               | tuple |          | postgres |                   |   
 MADlib | mlogregr_result                   | mlogregr_result                   | tuple |          | postgres |                   |   
 MADlib | mlogregr_summary_result           | mlogregr_summary_result           | tuple |          | postgres |                   |   
 MADlib | mlp_result                        | mlp_result                        | tuple |          | postgres |                   |   
 MADlib | __multinom_result_type            | __multinom_result_type            | tuple |          | postgres |                   |   
 MADlib | mw_test_result                    | mw_test_result                    | tuple |          | postgres |                   |   
 MADlib | one_way_anova_result              | one_way_anova_result              | tuple |          | postgres |                   |   
 MADlib | __ordinal_result_type             | __ordinal_result_type             | tuple |          | postgres |                   |   
 MADlib | path_match_result                 | path_match_result                 | tuple |          | postgres |                   |   
 MADlib | _pivotalr_lda_model               | _pivotalr_lda_model               | tuple |          | postgres |                   |   
 MADlib | _prune_result_type                | _prune_result_type                | tuple |          | postgres |                   |   
 MADlib | __rb_coxph_hs_result              | __rb_coxph_hs_result              | tuple |          | postgres |                   |   
 MADlib | __rb_coxph_result                 | __rb_coxph_result                 | tuple |          | postgres |                   |   
 MADlib | residual_norm_result              | residual_norm_result              | tuple |          | postgres |                   |   
 MADlib | robust_linregr_result             | robust_linregr_result             | tuple |          | postgres |                   |   
 MADlib | robust_logregr_result             | robust_logregr_result             | tuple |          | postgres |                   |   
 MADlib | robust_mlogregr_result            | robust_mlogregr_result            | tuple |          | postgres |                   |   
 MADlib | sparse_linear_solver_result       | sparse_linear_solver_result       | tuple |          | postgres |                   |   
 MADlib | summary_result                    | summary_result                    | tuple |          | postgres |                   |   
 MADlib | __svd_bidiagonal_matrix_result    | __svd_bidiagonal_matrix_result    | tuple |          | postgres |                   |   
 MADlib | __svd_lanczos_result              | __svd_lanczos_result              | tuple |          | postgres |                   |   
 MADlib | __svd_vec_mat_mult_result         | __svd_vec_mat_mult_result         | tuple |          | postgres |                   |   
 MADlib | svec                              | svec                              | var   |          | postgres |                   |   
 MADlib | _tree_result_type                 | _tree_result_type                 | tuple |          | postgres |                   |   
 MADlib | t_test_result                     | t_test_result                     | tuple |          | postgres |                   |   
 MADlib | __utils_scales                    | __utils_scales                    | tuple |          | postgres |                   |   
 MADlib | wsr_test_result                   | wsr_test_result                   | tuple |          | postgres |                   |   
 MADlib | xgb_gridsearch_train_results_type | xgb_gridsearch_train_results_type | tuple |          | postgres |                   |   
 public | vector                            | vector                            | var   |          | postgres |                   |   
(73 rows)  
  
postgres=#   \do+  
                                                    List of operators  
 Schema | Name |   Left arg type    |   Right arg type   |   Result type    |           Function            | Description   
--------+------+--------------------+--------------------+------------------+-------------------------------+-------------  
 MADlib | %*%  | double precision[] | double precision[] | double precision | MADlib.svec_dot               |   
 MADlib | %*%  | double precision[] | svec               | double precision | MADlib.svec_dot               |   
 MADlib | %*%  | svec               | double precision[] | double precision | MADlib.svec_dot               |   
 MADlib | %*%  | svec               | svec               | double precision | MADlib.svec_dot               |   
 MADlib | *    | double precision[] | double precision[] | svec             | float8arr_mult_float8arr      |   
 MADlib | *    | double precision[] | svec               | svec             | float8arr_mult_svec           |   
 MADlib | *    | svec               | double precision[] | svec             | svec_mult_float8arr           |   
 MADlib | *    | svec               | svec               | svec             | svec_mult                     |   
 MADlib | *||  | integer            | svec               | svec             | svec_concat_replicate         |   
 MADlib | +    | double precision[] | double precision[] | svec             | float8arr_plus_float8arr      |   
 MADlib | +    | double precision[] | svec               | svec             | float8arr_plus_svec           |   
 MADlib | +    | svec               | double precision[] | svec             | svec_plus_float8arr           |   
 MADlib | +    | svec               | svec               | svec             | svec_plus                     |   
 MADlib | -    | double precision[] | double precision[] | svec             | float8arr_minus_float8arr     |   
 MADlib | -    | double precision[] | svec               | svec             | float8arr_minus_svec          |   
 MADlib | -    | svec               | double precision[] | svec             | svec_minus_float8arr          |   
 MADlib | -    | svec               | svec               | svec             | svec_minus                    |   
 MADlib | /    | double precision[] | double precision[] | svec             | float8arr_div_float8arr       |   
 MADlib | /    | double precision[] | svec               | svec             | float8arr_div_svec            |   
 MADlib | /    | svec               | double precision[] | svec             | svec_div_float8arr            |   
 MADlib | /    | svec               | svec               | svec             | svec_div                      |   
 MADlib | <    | svec               | svec               | boolean          | svec_lt                       |   
 MADlib | <=   | svec               | svec               | boolean          | svec_le                       |   
 MADlib | <>   | svec               | svec               | boolean          | svec_ne                       |   
 MADlib | =    | svec               | svec               | boolean          | svec_eq                       |   
 MADlib | ==   | svec               | svec               | boolean          | svec_eq                       |   
 MADlib | >    | svec               | svec               | boolean          | svec_gt                       |   
 MADlib | >=   | svec               | svec               | boolean          | svec_ge                       |   
 MADlib | ^    | svec               | svec               | svec             | svec_pow                      |   
 MADlib | ||   | svec               | svec               | svec             | svec_concat                   |   
 public | +    | vector             | vector             | vector           | vector_add                    |   
 public | -    | vector             | vector             | vector           | vector_sub                    |   
 public | <    | vector             | vector             | boolean          | vector_lt                     |   
 public | <#>  | vector             | vector             | double precision | vector_negative_inner_product |   
 public | <->  | vector             | vector             | double precision | l2_distance                   |   
 public | <=   | vector             | vector             | boolean          | vector_le                     |   
 public | <=>  | vector             | vector             | double precision | cosine_distance               |   
 public | <>   | vector             | vector             | boolean          | vector_ne                     |   
 public | =    | vector             | vector             | boolean          | vector_eq                     |   
 public | >    | vector             | vector             | boolean          | vector_gt                     |   
 public | >=   | vector             | vector             | boolean          | vector_ge                     |   
(41 rows)  
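
The listings above show the types and operators that MADlib adds, but not a model being trained. As a minimal sketch, the hypothetical shell function below prints SQL that fits a linear regression with `madlib.linregr_train(source_table, out_table, dependent_varname, independent_varname)`. The table names `demo_houses` and `demo_houses_model` and the sample rows are made up for illustration, and the SQL assumes MADlib was installed into the schema `madlib`.

```shell
# Print SQL that trains a MADlib linear regression on a tiny demo table.
# ASSUMPTIONS: the demo_houses tables and their rows are invented for this
# example; MADlib is installed in the schema "madlib".
madlib_linregr_demo_sql() {
  cat <<'SQL'
DROP TABLE IF EXISTS demo_houses, demo_houses_model, demo_houses_model_summary;
CREATE TABLE demo_houses (id int, size_sqft float8, price float8);
INSERT INTO demo_houses VALUES
  (1, 770, 50000), (2, 1410, 85000), (3, 1060, 22500),
  (4, 1300, 90000), (5, 1500, 133000), (6, 820, 90500);
-- linregr_train(source_table, out_table, dependent_varname, independent_varname)
-- The leading 1 in the array adds an intercept term.
SELECT madlib.linregr_train('demo_houses', 'demo_houses_model',
                            'price', 'ARRAY[1, size_sqft]');
-- The model table holds the fitted coefficients and fit statistics.
SELECT coef, r2 FROM demo_houses_model;
SQL
}
```

The function only prints the statements, so they can be reviewed first and then run with `madlib_linregr_demo_sql | psql -h 127.0.0.1 -d postgres`.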

References

https://madlib.apache.org/docs/latest/index.html

https://cwiki.apache.org/confluence/display/MADLIB/Installation+Guide


digoal
