全部产品
Search
文档中心

Platform For AI:Praktik terbaik untuk pembuatan fitur

更新时间:Nov 01, 2025

Anda dapat menggunakan FeatureStore untuk menghasilkan fitur di Platform for AI (PAI). FeatureStore menstandarkan operasi pembuatan yang umum digunakan. Anda hanya perlu melakukan beberapa konfigurasi untuk menghasilkan fitur, sehingga secara efisien menyederhanakan proses pembuatan fitur. Pembuatan fitur banyak digunakan dalam berbagai skenario seperti rekomendasi, periklanan, kontrol risiko, dan pembelajaran mesin. Topik ini menjelaskan cara menggunakan tabel sumber untuk menghasilkan fitur dan tabel sampel dalam skenario rekomendasi.

Prasyarat

Pastikan bahwa persiapan yang dijelaskan dalam tabel berikut sudah selesai.

Layanan

Operasi

PAI

MaxCompute

DataWorks

1. Lakukan persiapan

Persiapkan tabel sumber

Dalam skenario rekomendasi, pembuatan fitur memerlukan tiga tabel sumber berikut. Tabel-tabel ini dapat digunakan untuk menghasilkan ratusan fitur, memfasilitasi pembuatan model agar sesuai dengan tujuan.

  • rec_sln_demo_user_table_preprocess_v1: tabel pengguna yang berisi fitur dasar pengguna, seperti jenis kelamin, usia, kota, dan jumlah pengikut.

  • rec_sln_demo_item_table_preprocess_v1: tabel item yang berisi fitur dasar item, seperti kategori, penulis, jumlah klik kumulatif, dan jumlah pujian kumulatif.

  • rec_sln_demo_behavior_table_preprocess_v1: tabel perilaku yang berisi fitur perilaku. Misalnya, pengguna mengklik item pada waktu tertentu.

Tabel data disimpan di ruang kerja pai_online_project yang terlihat oleh semua pengguna. Tabel data hanya menyimpan data simulasi. Anda harus mengeksekusi pernyataan SQL di DataWorks untuk menyinkronkan data dari tabel-tabel sebelumnya di ruang kerja pai_online_project ke proyek MaxCompute Anda. Ikuti langkah-langkah berikut:

  1. Masuk ke Konsol DataWorks.

  2. Di panel navigasi di sebelah kiri, klik Data Development & O&M > Data Development.

  3. Pilih ruang kerja DataWorks yang Anda buat dan klik Enter Data Development.

  4. Arahkan kursor ke New, dan pilih New Node > MaxCompute > ODPS SQL. Di halaman yang muncul, konfigurasikan parameter node.

    Parameter

    Nilai yang Disarankan

    Engine Instance

    Pilih mesin MaxCompute yang Anda buat.

    Node Type

    ODPS SQL

    Path

    Alur Bisnis/Workflow/MaxCompute

    Name

    Masukkan nama kustom.

  5. Klik Confirm.

  6. Di editor SQL node baru, eksekusi pernyataan SQL berikut untuk menyinkronkan data di tabel pengguna, tabel item, dan tabel perilaku dari ruang kerja pai_online_project ke proyek MaxCompute Anda. Pilih grup sumber daya eksklusif yang ada untuk menyinkronkan data.

    • Eksekusi pernyataan SQL berikut untuk menyinkronkan data di tabel pengguna rec_sln_demo_user_table_preprocess_v1:

      CREATE TABLE IF NOT EXISTS rec_sln_demo_user_table_preprocess_v1
      like pai_online_project.rec_sln_demo_user_table_preprocess_v1
      STORED AS ALIORC  
      LIFECYCLE 90;
      
      INSERT OVERWRITE TABLE rec_sln_demo_user_table_preprocess_v1 PARTITION(ds)
      SELECT *
      FROM pai_online_project.rec_sln_demo_user_table_preprocess_v1
      WHERE ds >= '20240530' and ds <='20240605';
    • Eksekusi pernyataan SQL berikut untuk menyinkronkan data di tabel item rec_sln_demo_item_table_preprocess_v1:

      CREATE TABLE IF NOT EXISTS rec_sln_demo_item_table_preprocess_v1
      like pai_online_project.rec_sln_demo_item_table_preprocess_v1
      STORED AS ALIORC  
      LIFECYCLE 90;
      
      INSERT OVERWRITE TABLE rec_sln_demo_item_table_preprocess_v1 PARTITION(ds)
      SELECT *
      FROM pai_online_project.rec_sln_demo_item_table_preprocess_v1
      WHERE ds >= '20240530' and ds <='20240605';
    • Eksekusi pernyataan SQL berikut untuk menyinkronkan data di tabel perilaku rec_sln_demo_behavior_table_preprocess_v1:

      CREATE TABLE IF NOT EXISTS rec_sln_demo_behavior_table_preprocess_v1
      like pai_online_project.rec_sln_demo_behavior_table_preprocess_v1
      STORED AS ALIORC  
      LIFECYCLE 90;
      
      INSERT OVERWRITE TABLE rec_sln_demo_behavior_table_preprocess_v1 PARTITION(ds)
      SELECT *
      FROM pai_online_project.rec_sln_demo_behavior_table_preprocess_v1
      WHERE ds >= '20240530' and ds <='20240605';

Instal FeatureStore SDK untuk Python

Disarankan menggunakan Jupyter Notebook untuk menjalankan kode berikut.

  • Instal FeatureStore SDK untuk Python di lingkungan Python3.

    %pip install https://feature-store-py.oss-cn-beijing.aliyuncs.com/package/feature_store_py-2.0.2-py3-none-any.whl
  • Impor modul fungsional yang diperlukan.

    import os
    from feature_store_py import FeatureStoreClient
    from feature_store_py.fs_datasource import MaxComputeDataSource
    from feature_store_py.feature_engineering import TableTransform, Condition, DayOf, ComboTransform, Feature, AggregationTransform, auto_count_feature_transform, WindowTransform, auto_window_feature_transform

2. Transformasi tabel sumber

Setelah menyelesaikan persiapan, Anda dapat melihat tabel pengguna rec_sln_demo_user_table_preprocess_v1, tabel item rec_sln_demo_item_table_preprocess_v1, dan tabel perilaku rec_sln_demo_behavior_table_preprocess_v1 di ruang kerja Anda.

Sebelum menghasilkan fitur menggunakan tabel-tabel tersebut, pra-proses tabel-tabel tersebut untuk memfasilitasi pembuatan fitur. Anda dapat mengeksekusi pernyataan SQL berikut menggunakan node ODPS SQL yang Anda buat untuk pra-memproses tabel-tabel tersebut. Untuk informasi lebih lanjut, lihat Persiapkan Tabel Sumber.

  • Gabungkan tabel pengguna, tabel item, dan tabel perilaku.

    CREATE TABLE IF NOT EXISTS rec_sln_demo_behavior_table_preprocess_wide_v1 
    (
        request_id bigint
        ,user_id string
        ,page string
        ,net_type string
        ,day_h bigint COMMENT 'Jam saat perilaku terjadi.'
        ,week_day bigint COMMENT 'Hari dalam seminggu saat perilaku terjadi.'
        ,event_unix_time bigint
        ,item_id string
        ,event string
        ,playtime double
        ,gender string
        ,age bigint
        ,city string
        ,item_cnt bigint
        ,follow_cnt bigint
        ,follower_cnt bigint
        ,is_new_user bigint
        ,tags string
        ,duration double
        ,category string
        ,author bigint
        ,click_count bigint
        ,praise_count bigint
        ,is_new_item bigint
    )
    PARTITIONED BY 
    (
        ds string
    )
    LIFECYCLE 90
    ;
    INSERT OVERWRITE TABLE rec_sln_demo_behavior_table_preprocess_wide_v1 PARTITION(ds='${bdp.system.bizdate}')
    SELECT  sq0.request_id
            ,sq0.user_id
            ,sq0.page
            ,sq0.net_type
            ,sq0.day_h
            ,sq0.week_day
            ,sq0.event_unix_time
            ,sq0.item_id
            ,sq0.event
            ,sq0.playtime
            ,sq1.gender
            ,sq1.age
            ,sq1.city
            ,sq1.item_cnt
            ,sq1.follow_cnt
            ,sq1.follower_cnt
            ,sq1.is_new_user
            ,sq1.tags
            ,sq2.duration
            ,sq2.category
            ,sq2.author
            ,sq2.click_count
            ,sq2.praise_count
            ,sq2.is_new_item
    FROM    (
                SELECT  *
                FROM    rec_sln_demo_behavior_table_preprocess_v1
                WHERE   ds = '${bdp.system.bizdate}'
            ) sq0
    LEFT JOIN (
                  SELECT  *
                  FROM    rec_sln_demo_user_table_preprocess_v1
                  WHERE   ds = '${bdp.system.bizdate}'
              ) sq1
    ON      sq0.user_id = sq1.user_id LEFT
    JOIN    (
                SELECT  *
                FROM    rec_sln_demo_item_table_preprocess_v1
                WHERE   ds = '${bdp.system.bizdate}'
            ) sq2
    ON      sq0.item_id = sq2.item_id
    ;
  • Pra-proses tabel perilaku untuk mentransformasikannya menjadi tabel label. Label diperlukan untuk pelatihan model. Dalam contoh ini, klik, durasi pemutaran, dan pujian digunakan sebagai label.

    CREATE TABLE IF NOT EXISTS rec_sln_demo_fs_label_table_v1
    (
        request_id bigint
        ,user_id string
        ,page string
        ,net_type string
        ,day_h bigint COMMENT 'Jam saat perilaku terjadi.'
        ,week_day bigint COMMENT 'Hari dalam seminggu saat perilaku terjadi.'
        ,day_min string
        ,event_unix_time bigint
        ,item_id string
        ,playtime double
        ,is_click BIGINT
        ,ln_playtime DOUBLE
        ,is_praise BIGINT
    )
    PARTITIONED BY 
    (
        ds string
    )
    LIFECYCLE 90
    ;
    INSERT OVERWRITE TABLE rec_sln_demo_fs_label_table_v1 PARTITION(ds='${bdp.system.bizdate}')
    SELECT  request_id
            ,user_id
            ,MAX(page) page
            ,MAX(net_type) net_type
            ,MAX(day_h) day_h
            ,MAX(week_day) week_day
            ,TO_CHAR(FROM_UNIXTIME(MIN(event_unix_time)),'yyyymmddhhmi') day_min
            ,MAX(event_unix_time) event_unix_time
            ,item_id
            ,MAX(playtime) playtime
            ,max(if(event='click', 1, 0)) is_click
            ,ln(sum(playtime) + 1) ln_playtime
            ,max(if(event='praise', 1, 0)) is_praise
    FROM    rec_sln_demo_behavior_table_preprocess_v1
    WHERE   ds = '${bdp.system.bizdate}'
    GROUP BY request_id
             ,user_id
             ,item_id
    ;

Setelah mengeksekusi pernyataan SQL sebelumnya, Anda dapat memperoleh tabel-tabel berikut:

  • rec_sln_demo_behavior_table_preprocess_wide_v1: tabel perilaku baru yang digunakan untuk menghasilkan fitur.

  • rec_sln_demo_fs_label_table_v1: tabel label yang digunakan untuk menghasilkan tabel sampel.

3. Hasilkan fitur

Anda dapat memanggil fungsi ekstensi otomatis untuk menghasilkan fitur. Ratusan fitur dapat dihasilkan hanya dengan beberapa baris kode.

Pembuatan fitur memerlukan pembuatan fitur pengguna dan fitur item. Ikuti langkah-langkah berikut:

Hasilkan fitur pengguna

Disarankan menggunakan Jupyter Notebook untuk menjalankan kode berikut.

  1. Inisialisasi klien.

    access_key_id=os.environ.get ("ALIBABA_CLOUD_ACCESS_KEY_ID") # Masukkan ID AccessKey Anda.
    access_key_secret=os.environ.get ("ALIBABA_CLOUD_ACCESS_KEY_SECRET") # Masukkan Rahasia AccessKey Anda.
    project='project_name' # Masukkan nama proyek Anda.
    region='cn-hangzhou' # Masukkan wilayah tempat proyek Anda berada. Misalnya, jika proyek Anda berada di China (Hangzhou), masukkan cn-hangzhou.
    fs_client = FeatureStoreClient(access_key_id=access_key_id, access_key_secret=access_key_secret, region=region)
  2. Tentukan tabel sumber tempat Anda ingin melakukan transformasi fitur.

    input_bhv_table_name = "rec_sln_demo_behavior_table_preprocess_wide_v1"
    ds_bhv = MaxComputeDataSource(table=input_bhv_table_name, project=project)
    
    input_user_table_name = "rec_sln_demo_user_table_preprocess_v1"
    ds_user = MaxComputeDataSource(table=input_user_table_name, project=project)
    
    input_item_table_name = "rec_sln_demo_item_table_preprocess_v1"
    ds_item = MaxComputeDataSource(table=input_item_table_name, project=project)
  3. Tentukan nama untuk tabel fitur pengguna yang akan dihasilkan dengan melakukan operasi JoinTransform dan AggregationTransform.

    agg_user_table_v1 = 'rec_sln_demo_user_table_preprocess_agg_v1'
  4. Implementasikan ekstensi otomatis untuk fitur statistik menggunakan fungsi ekstensi otomatis.

    name_prefix = "user"
    input_list = ["playtime", "duration", "click_count", "praise_count"]
    event_name = 'event'
    event_type = 'expr'
    group_by_key = "user_id"
    window_size = [3,7,15]
    user_count_feature_list = auto_count_feature_transform(name_prefix, input_list, event_name, event_type, group_by_key, window_size)
    print("len_count_feature_list = ", len(user_count_feature_list))
    print("count_feature_list = ", user_count_feature_list)
  5. Jalankan pipeline dan lihat hasil pembuatan tabel.

    1. Buat pipeline.

      agg_user_bhv_pipeline = fs_client.create_pipeline(ds_bhv).add_feature_transform(user_count_feature_list)
      agg_user_pipeline =fs_client.create_pipeline(ds_user, agg_user_table_v1).merge(agg_user_bhv_pipeline, keep_input_columns=False)
    2. Jalankan pipeline.

      execute_date = '20240605'
      output_agg_user_table = agg_user_pipeline.execute(execute_date, drop_table=True)
    3. Lihat hasil pembuatan tabel.

      agg_user_ret = output_agg_user_table.to_pandas(execute_date, limit=20)
      agg_user_ret
  6. Tentukan nama untuk tabel fitur pengguna yang akan dihasilkan dengan melakukan operasi WindowTransform.

    win_user_table_v1 = 'rec_sln_demo_user_table_preprocess_win_v1'
  7. Gunakan fungsi ekstensi otomatis bawaan untuk melakukan operasi WindowTransform.

    name_prefix = 'user'
    input_list = ['day_h', 'category']
    agg_field = ['duration', 'click_count']
    event_name = 'event'
    event_type = 'expr'
    group_by_key = 'user_id'
    window_size = [7, 15, 30, 45]
    user_win_feature_list = auto_window_feature_transform(name_prefix, input_list, agg_field, event_name, event_type, group_by_key, window_size)
    print("len_user_win_feature_list = ", len(user_win_feature_list))
    print("user_win_feature_list = ", user_win_feature_list)
  8. Jalankan pipeline dan lihat hasil pembuatan tabel.

    1. Buat pipeline.

      win_user_bhv_pipeline = fs_client.create_pipeline(ds_bhv).add_feature_transform(user_win_feature_list)
      win_user_pipeline = fs_client.create_pipeline(ds_user, win_user_table_v1).merge(win_user_bhv_pipeline, keep_input_columns=False)
    2. Jalankan pipeline.

      execute_date = '20240605'
      output_win_user_table = win_user_pipeline.execute(execute_date, drop_table=True)
      
      # Untuk informasi lebih lanjut, lihat dokumentasi terkait. Operasi ini melibatkan tabel antara. Pertama kali Anda menjalankan kode, Anda harus mengisi ulang data, yang mungkin memerlukan waktu lama. Anda dapat menjalankan kode berikut untuk mengisi ulang data:
      # output_win_user_table = win_user_pipeline.execute(execute_date, drop_table=True, backfill_partitions=True)
    3. Lihat hasil pembuatan tabel.

      win_user_ret = output_win_user_table.to_pandas(execute_date, limit=20)
      win_user_ret

Hasilkan fitur item

Setelah menghasilkan fitur pengguna, Anda dapat menghasilkan fitur item berdasarkan jenis item. Ikuti langkah-langkah berikut:

  1. Tentukan nama untuk tabel fitur item yang akan dihasilkan dengan melakukan operasi AggregationTransform.

    agg_item_table_v1 = 'rec_sln_demo_item_table_preprocess_agg_v1'
  2. Implementasikan ekstensi otomatis untuk fitur statistik menggunakan fungsi ekstensi otomatis.

    name_prefix = "item"
    input_list = ["item_cnt", "follow_cnt", "follower_cnt"]
    event_name = 'event'
    event_type = 'expr'
    group_by_key = "item_id"
    window_size = [3,7,15]
    item_count_feature_list = auto_count_feature_transform(name_prefix, input_list, event_name, event_type, group_by_key, window_size)
    print("len_count_feature_list = ", len(item_count_feature_list))
    print("count_feature_list = ", item_count_feature_list)
  3. Jalankan pipeline dan lihat hasil pembuatan tabel.

    1. Buat pipeline.

      agg_item_bhv_pipeline =fs_client.create_pipeline(ds_bhv).add_feature_transform(item_count_feature_list)
      agg_item_pipeline =fs_client.create_pipeline(ds_item, agg_item_table_v1).merge(agg_item_bhv_pipeline, keep_input_columns=False)
    2. Jalankan pipeline.

      execute_date = '20240605'
      output_agg_item_table = agg_item_pipeline.execute(execute_date, drop_table=True)
    3. Lihat hasil pembuatan tabel.

      agg_item_ret = output_agg_item_table.to_pandas(execute_date, limit=20)
      agg_item_ret
  4. Tentukan nama untuk tabel fitur item yang akan dihasilkan dengan melakukan operasi WindowTransform.

    win_item_table_v1 = 'rec_sln_demo_item_table_preprocess_win_v1'
  5. Gunakan fungsi ekstensi otomatis bawaan untuk melakukan operasi WindowTransform.

    name_prefix = 'item'
    input_list = ['day_h', 'category']
    agg_field = ['click_count', 'praise_count']
    event_name = 'event'
    event_type = 'expr'
    group_by_key = 'item_id'
    window_size = [7, 15, 30, 45]
    item_win_feature_list = auto_window_feature_transform(name_prefix, input_list, agg_field, event_name, event_type, group_by_key, window_size)
    print("len_item_win_feature_list = ", len(item_win_feature_list))
    print("item_win_feature_list = ", item_win_feature_list)
  6. Jalankan pipeline dan lihat hasil pembuatan tabel.

    1. Buat pipeline.

      win_item_bhv_pipeline = fs_client.create_pipeline(ds_bhv).add_feature_transform(item_win_feature_list)
      win_item_pipeline = fs_client.create_pipeline(ds_item, win_item_table_v1).merge(win_item_bhv_pipeline, keep_input_columns=False)
    2. Jalankan pipeline.

      execute_date = '20240605'
      output_win_item_table = win_item_pipeline.execute(execute_date, drop_table=True)
      
      # Untuk informasi lebih lanjut, lihat dokumentasi terkait. Operasi ini melibatkan tabel antara. Pertama kali Anda menjalankan kode, Anda harus mengisi ulang data, yang mungkin memerlukan waktu lama. Anda dapat menjalankan kode berikut untuk mengisi ulang data:
      # output_win_item_table = win_item_pipeline.execute(execute_date, drop_table=True, backfill_partitions=True)
    3. Lihat hasil pembuatan tabel.

      win_item_ret = output_win_item_table.to_pandas(execute_date, limit=20)
      win_item_ret

4. Hasilkan tabel sampel

Setelah melakukan operasi sebelumnya, Anda dapat memperoleh tabel label, tiga tabel fitur pengguna, dan tiga tabel fitur item. Tiga tabel fitur tersebut mencakup tabel sumber dan dua tabel fitur yang dihasilkan setelah pemrosesan fitur. Kemudian, Anda dapat mengeksekusi pernyataan SQL berikut menggunakan node ODPS SQL yang Anda buat untuk menggabungkan tujuh tabel menjadi tabel sampel lengkap. Untuk informasi lebih lanjut, lihat Persiapkan Tabel Sumber.

CREATE TABLE IF NOT EXISTS fs_demo_fs_engineering_v1_training_set(
	request_id BIGINT,
	 user_id STRING,
	 page STRING,
	 net_type STRING,
	 day_h BIGINT,
	 week_day BIGINT,
	 day_min STRING,
	 event_unix_time BIGINT,
	 item_id STRING,
	 playtime DOUBLE,
	 is_click BIGINT,
	 ln_playtime DOUBLE,
	 is_praise BIGINT,
	 duration DOUBLE,
	 category STRING,
	 author BIGINT,
	 click_count BIGINT,
	 praise_count BIGINT,
	 is_new_item BIGINT,
	 item__sum_item_cnt_3d BIGINT,
	 item__sum_follow_cnt_3d BIGINT,
	 item__sum_follower_cnt_3d BIGINT,
	 item__max_item_cnt_3d BIGINT,
	 item__max_follow_cnt_3d BIGINT,
	 item__max_follower_cnt_3d BIGINT,
	 item__min_item_cnt_3d BIGINT,
	 item__min_follow_cnt_3d BIGINT,
	 item__min_follower_cnt_3d BIGINT,
	 item__avg_item_cnt_3d DOUBLE,
	 item__avg_follow_cnt_3d DOUBLE,
	 item__avg_follower_cnt_3d DOUBLE,
	 item__sum_item_cnt_7d BIGINT,
	 item__sum_follow_cnt_7d BIGINT,
	 item__sum_follower_cnt_7d BIGINT,
	 item__max_item_cnt_7d BIGINT,
	 item__max_follow_cnt_7d BIGINT,
	 item__max_follower_cnt_7d BIGINT,
	 item__min_item_cnt_7d BIGINT,
	 item__min_follow_cnt_7d BIGINT,
	 item__min_follower_cnt_7d BIGINT,
	 item__avg_item_cnt_7d DOUBLE,
	 item__avg_follow_cnt_7d DOUBLE,
	 item__avg_follower_cnt_7d DOUBLE,
	 item__sum_item_cnt_15d BIGINT,
	 item__sum_follow_cnt_15d BIGINT,
	 item__sum_follower_cnt_15d BIGINT,
	 item__max_item_cnt_15d BIGINT,
	 item__max_follow_cnt_15d BIGINT,
	 item__max_follower_cnt_15d BIGINT,
	 item__min_item_cnt_15d BIGINT,
	 item__min_follow_cnt_15d BIGINT,
	 item__min_follower_cnt_15d BIGINT,
	 item__avg_item_cnt_15d DOUBLE,
	 item__avg_follow_cnt_15d DOUBLE,
	 item__avg_follower_cnt_15d DOUBLE,
	 item__kv_day_h_click_count_sum_7d STRING,
	 item__kv_category_click_count_sum_7d STRING,
	 item__kv_day_h_praise_count_sum_7d STRING,
	 item__kv_category_praise_count_sum_7d STRING,
	 item__kv_day_h_click_count_max_7d STRING,
	 item__kv_category_click_count_max_7d STRING,
	 item__kv_day_h_praise_count_max_7d STRING,
	 item__kv_category_praise_count_max_7d STRING,
	 item__kv_day_h_click_count_min_7d STRING,
	 item__kv_category_click_count_min_7d STRING,
	 item__kv_day_h_praise_count_min_7d STRING,
	 item__kv_category_praise_count_min_7d STRING,
	 item__kv_day_h_click_count_avg_7d STRING,
	 item__kv_category_click_count_avg_7d STRING,
	 item__kv_day_h_praise_count_avg_7d STRING,
	 item__kv_category_praise_count_avg_7d STRING,
	 item__kv_day_h_click_count_sum_15d STRING,
	 item__kv_category_click_count_sum_15d STRING,
	 item__kv_day_h_praise_count_sum_15d STRING,
	 item__kv_category_praise_count_sum_15d STRING,
	 item__kv_day_h_click_count_max_15d STRING,
	 item__kv_category_click_count_max_15d STRING,
	 item__kv_day_h_praise_count_max_15d STRING,
	 item__kv_category_praise_count_max_15d STRING,
	 item__kv_day_h_click_count_min_15d STRING,
	 item__kv_category_click_count_min_15d STRING,
	 item__kv_day_h_praise_count_min_15d STRING,
	 item__kv_category_praise_count_min_15d STRING,
	 item__kv_day_h_click_count_avg_15d STRING,
	 item__kv_category_click_count_avg_15d STRING,
	 item__kv_day_h_praise_count_avg_15d STRING,
	 item__kv_category_praise_count_avg_15d STRING,
	 item__kv_day_h_click_count_sum_30d STRING,
	 item__kv_category_click_count_sum_30d STRING,
	 item__kv_day_h_praise_count_sum_30d STRING,
	 item__kv_category_praise_count_sum_30d STRING,
	 item__kv_day_h_click_count_max_30d STRING,
	 item__kv_category_click_count_max_30d STRING,
	 item__kv_day_h_praise_count_max_30d STRING,
	 item__kv_category_praise_count_max_30d STRING,
	 item__kv_day_h_click_count_min_30d STRING,
	 item__kv_category_click_count_min_30d STRING,
	 item__kv_day_h_praise_count_min_30d STRING,
	 item__kv_category_praise_count_min_30d STRING,
	 item__kv_day_h_click_count_avg_30d STRING,
	 item__kv_category_click_count_avg_30d STRING,
	 item__kv_day_h_praise_count_avg_30d STRING,
	 item__kv_category_praise_count_avg_30d STRING,
	 item__kv_day_h_click_count_sum_45d STRING,
	 item__kv_category_click_count_sum_45d STRING,
	 item__kv_day_h_praise_count_sum_45d STRING,
	 item__kv_category_praise_count_sum_45d STRING,
	 item__kv_day_h_click_count_max_45d STRING,
	 item__kv_category_click_count_max_45d STRING,
	 item__kv_day_h_praise_count_max_45d STRING,
	 item__kv_category_praise_count_max_45d STRING,
	 item__kv_day_h_click_count_min_45d STRING,
	 item__kv_category_click_count_min_45d STRING,
	 item__kv_day_h_praise_count_min_45d STRING,
	 item__kv_category_praise_count_min_45d STRING,
	 item__kv_day_h_click_count_avg_45d STRING,
	 item__kv_category_click_count_avg_45d STRING,
	 item__kv_day_h_praise_count_avg_45d STRING,
	 item__kv_category_praise_count_avg_45d STRING,
	 gender STRING,
	 age BIGINT,
	 city STRING,
	 item_cnt BIGINT,
	 follow_cnt BIGINT,
	 follower_cnt BIGINT,
	 is_new_user BIGINT,
	 tags STRING,
	 user__sum_playtime_3d DOUBLE,
	 user__sum_duration_3d DOUBLE,
	 user__sum_click_count_3d BIGINT,
	 user__sum_praise_count_3d BIGINT,
	 user__max_playtime_3d DOUBLE,
	 user__max_duration_3d DOUBLE,
	 user__max_click_count_3d BIGINT,
	 user__max_praise_count_3d BIGINT,
	 user__min_playtime_3d DOUBLE,
	 user__min_duration_3d DOUBLE,
	 user__min_click_count_3d BIGINT,
	 user__min_praise_count_3d BIGINT,
	 user__avg_playtime_3d DOUBLE,
	 user__avg_duration_3d DOUBLE,
	 user__avg_click_count_3d DOUBLE,
	 user__avg_praise_count_3d DOUBLE,
	 user__sum_playtime_7d DOUBLE,
	 user__sum_duration_7d DOUBLE,
	 user__sum_click_count_7d BIGINT,
	 user__sum_praise_count_7d BIGINT,
	 user__max_playtime_7d DOUBLE,
	 user__max_duration_7d DOUBLE,
	 user__max_click_count_7d BIGINT,
	 user__max_praise_count_7d BIGINT,
	 user__min_playtime_7d DOUBLE,
	 user__min_duration_7d DOUBLE,
	 user__min_click_count_7d BIGINT,
	 user__min_praise_count_7d BIGINT,
	 user__avg_playtime_7d DOUBLE,
	 user__avg_duration_7d DOUBLE,
	 user__avg_click_count_7d DOUBLE,
	 user__avg_praise_count_7d DOUBLE,
	 user__sum_playtime_15d DOUBLE,
	 user__sum_duration_15d DOUBLE,
	 user__sum_click_count_15d BIGINT,
	 user__sum_praise_count_15d BIGINT,
	 user__max_playtime_15d DOUBLE,
	 user__max_duration_15d DOUBLE,
	 user__max_click_count_15d BIGINT,
	 user__max_praise_count_15d BIGINT,
	 user__min_playtime_15d DOUBLE,
	 user__min_duration_15d DOUBLE,
	 user__min_click_count_15d BIGINT,
	 user__min_praise_count_15d BIGINT,
	 user__avg_playtime_15d DOUBLE,
	 user__avg_duration_15d DOUBLE,
	 user__avg_click_count_15d DOUBLE,
	 user__avg_praise_count_15d DOUBLE,
	 user__kv_day_h_duration_sum_7d STRING,
	 user__kv_category_duration_sum_7d STRING,
	 user__kv_day_h_click_count_sum_7d STRING,
	 user__kv_category_click_count_sum_7d STRING,
	 user__kv_day_h_duration_max_7d STRING,
	 user__kv_category_duration_max_7d STRING,
	 user__kv_day_h_click_count_max_7d STRING,
	 user__kv_category_click_count_max_7d STRING,
	 user__kv_day_h_duration_min_7d STRING,
	 user__kv_category_duration_min_7d STRING,
	 user__kv_day_h_click_count_min_7d STRING,
	 user__kv_category_click_count_min_7d STRING,
	 user__kv_day_h_duration_avg_7d STRING,
	 user__kv_category_duration_avg_7d STRING,
	 user__kv_day_h_click_count_avg_7d STRING,
	 user__kv_category_click_count_avg_7d STRING,
	 user__kv_day_h_duration_sum_15d STRING,
	 user__kv_category_duration_sum_15d STRING,
	 user__kv_day_h_click_count_sum_15d STRING,
	 user__kv_category_click_count_sum_15d STRING,
	 user__kv_day_h_duration_max_15d STRING,
	 user__kv_category_duration_max_15d STRING,
	 user__kv_day_h_click_count_max_15d STRING,
	 user__kv_category_click_count_max_15d STRING,
	 user__kv_day_h_duration_min_15d STRING,
	 user__kv_category_duration_min_15d STRING,
	 user__kv_day_h_click_count_min_15d STRING,
	 user__kv_category_click_count_min_15d STRING,
	 user__kv_day_h_duration_avg_15d STRING,
	 user__kv_category_duration_avg_15d STRING,
	 user__kv_day_h_click_count_avg_15d STRING,
	 user__kv_category_click_count_avg_15d STRING,
	 user__kv_day_h_duration_sum_30d STRING,
	 user__kv_category_duration_sum_30d STRING,
	 user__kv_day_h_click_count_sum_30d STRING,
	 user__kv_category_click_count_sum_30d STRING,
	 user__kv_day_h_duration_max_30d STRING,
	 user__kv_category_duration_max_30d STRING,
	 user__kv_day_h_click_count_max_30d STRING,
	 user__kv_category_click_count_max_30d STRING,
	 user__kv_day_h_duration_min_30d STRING,
	 user__kv_category_duration_min_30d STRING,
	 user__kv_day_h_click_count_min_30d STRING,
	 user__kv_category_click_count_min_30d STRING,
	 user__kv_day_h_duration_avg_30d STRING,
	 user__kv_category_duration_avg_30d STRING,
	 user__kv_day_h_click_count_avg_30d STRING,
	 user__kv_category_click_count_avg_30d STRING,
	 user__kv_day_h_duration_sum_45d STRING,
	 user__kv_category_duration_sum_45d STRING,
	 user__kv_day_h_click_count_sum_45d STRING,
	 user__kv_category_click_count_sum_45d STRING,
	 user__kv_day_h_duration_max_45d STRING,
	 user__kv_category_duration_max_45d STRING,
	 user__kv_day_h_click_count_max_45d STRING,
	 user__kv_category_click_count_max_45d STRING,
	 user__kv_day_h_duration_min_45d STRING,
	 user__kv_category_duration_min_45d STRING,
	 user__kv_day_h_click_count_min_45d STRING,
	 user__kv_category_click_count_min_45d STRING,
	 user__kv_day_h_duration_avg_45d STRING,
	 user__kv_category_duration_avg_45d STRING,
	 user__kv_day_h_click_count_avg_45d STRING,
	 user__kv_category_click_count_avg_45d STRING
) 
PARTITIONED BY (ds STRING)
LIFECYCLE 90;

insert overwrite table fs_demo_fs_engineering_v1_training_set partition (ds = '${bdp.system.bizdate}')
select 
sq0.request_id,
sq0.user_id,
sq0.page,
sq0.net_type,
sq0.day_h,
sq0.week_day,
sq0.day_min,
sq0.event_unix_time,
sq0.item_id,
sq0.playtime,
sq0.is_click,
sq0.ln_playtime,
sq0.is_praise,
sq2.duration,
sq2.category,
sq2.author,
sq2.click_count,
sq2.praise_count,
sq2.is_new_item,
sq5.item__sum_item_cnt_3d,
sq5.item__sum_follow_cnt_3d,
sq5.item__sum_follower_cnt_3d,
sq5.item__max_item_cnt_3d,
sq5.item__max_follow_cnt_3d,
sq5.item__max_follower_cnt_3d,
sq5.item__min_item_cnt_3d,
sq5.item__min_follow_cnt_3d,
sq5.item__min_follower_cnt_3d,
sq5.item__avg_item_cnt_3d,
sq5.item__avg_follow_cnt_3d,
sq5.item__avg_follower_cnt_3d,
sq5.item__sum_item_cnt_7d,
sq5.item__sum_follow_cnt_7d,
sq5.item__sum_follower_cnt_7d,
sq5.item__max_item_cnt_7d,
sq5.item__max_follow_cnt_7d,
sq5.item__max_follower_cnt_7d,
sq5.item__min_item_cnt_7d,
sq5.item__min_follow_cnt_7d,
sq5.item__min_follower_cnt_7d,
sq5.item__avg_item_cnt_7d,
sq5.item__avg_follow_cnt_7d,
sq5.item__avg_follower_cnt_7d,
sq5.item__sum_item_cnt_15d,
sq5.item__sum_follow_cnt_15d,
sq5.item__sum_follower_cnt_15d,
sq5.item__max_item_cnt_15d,
sq5.item__max_follow_cnt_15d,
sq5.item__max_follower_cnt_15d,
sq5.item__min_item_cnt_15d,
sq5.item__min_follow_cnt_15d,
sq5.item__min_follower_cnt_15d,
sq5.item__avg_item_cnt_15d,
sq5.item__avg_follow_cnt_15d,
sq5.item__avg_follower_cnt_15d,
sq6.item__kv_day_h_click_count_sum_7d,
sq6.item__kv_category_click_count_sum_7d,
sq6.item__kv_day_h_praise_count_sum_7d,
sq6.item__kv_category_praise_count_sum_7d,
sq6.item__kv_day_h_click_count_max_7d,
sq6.item__kv_category_click_count_max_7d,
sq6.item__kv_day_h_praise_count_max_7d,
sq6.item__kv_category_praise_count_max_7d,
sq6.item__kv_day_h_click_count_min_7d,
sq6.item__kv_category_click_count_min_7d,
sq6.item__kv_day_h_praise_count_min_7d,
sq6.item__kv_category_praise_count_min_7d,
sq6.item__kv_day_h_click_count_avg_7d,
sq6.item__kv_category_click_count_avg_7d,
sq6.item__kv_day_h_praise_count_avg_7d,
sq6.item__kv_category_praise_count_avg_7d,
sq6.item__kv_day_h_click_count_sum_15d,
sq6.item__kv_category_click_count_sum_15d,
sq6.item__kv_day_h_praise_count_sum_15d,
sq6.item__kv_category_praise_count_sum_15d,
sq6.item__kv_day_h_click_count_max_15d,
sq6.item__kv_category_click_count_max_15d,
sq6.item__kv_day_h_praise_count_max_15d,
sq6.item__kv_category_praise_count_max_15d,
sq6.item__kv_day_h_click_count_min_15d,
sq6.item__kv_category_click_count_min_15d,
sq6.item__kv_day_h_praise_count_min_15d,
sq6.item__kv_category_praise_count_min_15d,
sq6.item__kv_day_h_click_count_avg_15d,
sq6.item__kv_category_click_count_avg_15d,
sq6.item__kv_day_h_praise_count_avg_15d,
sq6.item__kv_category_praise_count_avg_15d,
sq6.item__kv_day_h_click_count_sum_30d,
sq6.item__kv_category_click_count_sum_30d,
sq6.item__kv_day_h_praise_count_sum_30d,
sq6.item__kv_category_praise_count_sum_30d,
sq6.item__kv_day_h_click_count_max_30d,
sq6.item__kv_category_click_count_max_30d,
sq6.item__kv_day_h_praise_count_max_30d,
sq6.item__kv_category_praise_count_max_30d,
sq6.item__kv_day_h_click_count_min_30d,
sq6.item__kv_category_click_count_min_30d,
sq6.item__kv_day_h_praise_count_min_30d,
sq6.item__kv_category_praise_count_min_30d,
sq6.item__kv_day_h_click_count_avg_30d,
sq6.item__kv_category_click_count_avg_30d,
sq6.item__kv_day_h_praise_count_avg_30d,
sq6.item__kv_category_praise_count_avg_30d,
sq6.item__kv_day_h_click_count_sum_45d,
sq6.item__kv_category_click_count_sum_45d,
sq6.item__kv_day_h_praise_count_sum_45d,
sq6.item__kv_category_praise_count_sum_45d,
sq6.item__kv_day_h_click_count_max_45d,
sq6.item__kv_category_click_count_max_45d,
sq6.item__kv_day_h_praise_count_max_45d,
sq6.item__kv_category_praise_count_max_45d,
sq6.item__kv_day_h_click_count_min_45d,
sq6.item__kv_category_click_count_min_45d,
sq6.item__kv_day_h_praise_count_min_45d,
sq6.item__kv_category_praise_count_min_45d,
sq6.item__kv_day_h_click_count_avg_45d,
sq6.item__kv_category_click_count_avg_45d,
sq6.item__kv_day_h_praise_count_avg_45d,
sq6.item__kv_category_praise_count_avg_45d,
sq1.gender,
sq1.age,
sq1.city,
sq1.item_cnt,
sq1.follow_cnt,
sq1.follower_cnt,
sq1.is_new_user,
sq1.tags,
sq3.user__sum_playtime_3d,
sq3.user__sum_duration_3d,
sq3.user__sum_click_count_3d,
sq3.user__sum_praise_count_3d,
sq3.user__max_playtime_3d,
sq3.user__max_duration_3d,
sq3.user__max_click_count_3d,
sq3.user__max_praise_count_3d,
sq3.user__min_playtime_3d,
sq3.user__min_duration_3d,
sq3.user__min_click_count_3d,
sq3.user__min_praise_count_3d,
sq3.user__avg_playtime_3d,
sq3.user__avg_duration_3d,
sq3.user__avg_click_count_3d,
sq3.user__avg_praise_count_3d,
sq3.user__sum_playtime_7d,
sq3.user__sum_duration_7d,
sq3.user__sum_click_count_7d,
sq3.user__sum_praise_count_7d,
sq3.user__max_playtime_7d,
sq3.user__max_duration_7d,
sq3.user__max_click_count_7d,
sq3.user__max_praise_count_7d,
sq3.user__min_playtime_7d,
sq3.user__min_duration_7d,
sq3.user__min_click_count_7d,
sq3.user__min_praise_count_7d,
sq3.user__avg_playtime_7d,
sq3.user__avg_duration_7d,
sq3.user__avg_click_count_7d,
sq3.user__avg_praise_count_7d,
sq3.user__sum_playtime_15d,
sq3.user__sum_duration_15d,
sq3.user__sum_click_count_15d,
sq3.user__sum_praise_count_15d,
sq3.user__max_playtime_15d,
sq3.user__max_duration_15d,
sq3.user__max_click_count_15d,
sq3.user__max_praise_count_15d,
sq3.user__min_playtime_15d,
sq3.user__min_duration_15d,
sq3.user__min_click_count_15d,
sq3.user__min_praise_count_15d,
sq3.user__avg_playtime_15d,
sq3.user__avg_duration_15d,
sq3.user__avg_click_count_15d,
sq3.user__avg_praise_count_15d,
sq4.user__kv_day_h_duration_sum_7d,
sq4.user__kv_category_duration_sum_7d,
sq4.user__kv_day_h_click_count_sum_7d,
sq4.user__kv_category_click_count_sum_7d,
sq4.user__kv_day_h_duration_max_7d,
sq4.user__kv_category_duration_max_7d,
sq4.user__kv_day_h_click_count_max_7d,
sq4.user__kv_category_click_count_max_7d,
sq4.user__kv_day_h_duration_min_7d,
sq4.user__kv_category_duration_min_7d,
sq4.user__kv_day_h_click_count_min_7d,
sq4.user__kv_category_click_count_min_7d,
sq4.user__kv_day_h_duration_avg_7d,
sq4.user__kv_category_duration_avg_7d,
sq4.user__kv_day_h_click_count_avg_7d,
sq4.user__kv_category_click_count_avg_7d,
sq4.user__kv_day_h_duration_sum_15d,
sq4.user__kv_category_duration_sum_15d,
sq4.user__kv_day_h_click_count_sum_15d,
sq4.user__kv_category_click_count_sum_15d,
sq4.user__kv_day_h_duration_max_15d,
sq4.user__kv_category_duration_max_15d,
sq4.user__kv_day_h_click_count_max_15d,
sq4.user__kv_category_click_count_max_15d,
sq4.user__kv_day_h_duration_min_15d,
sq4.user__kv_category_duration_min_15d,
sq4.user__kv_day_h_click_count_min_15d,
sq4.user__kv_category_click_count_min_15d,
sq4.user__kv_day_h_duration_avg_15d,
sq4.user__kv_category_duration_avg_15d,
sq4.user__kv_day_h_click_count_avg_15d,
sq4.user__kv_category_click_count_avg_15d,
sq4.user__kv_day_h_duration_sum_30d,
sq4.user__kv_category_duration_sum_30d,
sq4.user__kv_day_h_click_count_sum_30d,
sq4.user__kv_category_click_count_sum_30d,
sq4.user__kv_day_h_duration_max_30d,
sq4.user__kv_category_duration_max_30d,
sq4.user__kv_day_h_click_count_max_30d,
sq4.user__kv_category_click_count_max_30d,
sq4.user__kv_day_h_duration_min_30d,
sq4.user__kv_category_duration_min_30d,
sq4.user__kv_day_h_click_count_min_30d,
sq4.user__kv_category_click_count_min_30d,
sq4.user__kv_day_h_duration_avg_30d,
sq4.user__kv_category_duration_avg_30d,
sq4.user__kv_day_h_click_count_avg_30d,
sq4.user__kv_category_click_count_avg_30d,
sq4.user__kv_day_h_duration_sum_45d,
sq4.user__kv_category_duration_sum_45d,
sq4.user__kv_day_h_click_count_sum_45d,
sq4.user__kv_category_click_count_sum_45d,
sq4.user__kv_day_h_duration_max_45d,
sq4.user__kv_category_duration_max_45d,
sq4.user__kv_day_h_click_count_max_45d,
sq4.user__kv_category_click_count_max_45d,
sq4.user__kv_day_h_duration_min_45d,
sq4.user__kv_category_duration_min_45d,
sq4.user__kv_day_h_click_count_min_45d,
sq4.user__kv_category_click_count_min_45d,
sq4.user__kv_day_h_duration_avg_45d,
sq4.user__kv_category_duration_avg_45d,
sq4.user__kv_day_h_click_count_avg_45d,
sq4.user__kv_category_click_count_avg_45d
from 
(
	select *
	from rec_sln_demo_fs_label_table_v1
	where ds = '${bdp.system.bizdate}'
) sq0
 left join (
	select 
	*
	from <project_name>.rec_sln_demo_user_table_preprocess_v1
	where ds = TO_CHAR(DATEADD(TO_DATE('${bdp.system.bizdate}','yyyymmdd'), - 1,'dd'),'yyyymmdd')
) sq1 on sq0.user_id = sq1.user_id
 left join (
	select 
	*
	from <project_name>.rec_sln_demo_item_table_preprocess_v1
	where ds = TO_CHAR(DATEADD(TO_DATE('${bdp.system.bizdate}','yyyymmdd'), - 1,'dd'),'yyyymmdd')
) sq2 on sq0.item_id = sq2.item_id
 left join (
	select 
	*
	from <project_name>.rec_sln_demo_user_table_preprocess_win_v1
	where ds = TO_CHAR(DATEADD(TO_DATE('${bdp.system.bizdate}','yyyymmdd'), - 1,'dd'),'yyyymmdd')
) sq4 on sq0.user_id = sq4.user_id
 left join (
	select 
	*
	from <project_name>.rec_sln_demo_item_table_preprocess_agg_v1
	where ds = TO_CHAR(DATEADD(TO_DATE('${bdp.system.bizdate}','yyyymmdd'), - 1,'dd'),'yyyymmdd')
) sq5 on sq0.item_id = sq5.item_id
 left join (
	select 
	*
	from <project_name>.rec_sln_demo_item_table_preprocess_win_v1
	where ds = TO_CHAR(DATEADD(TO_DATE('${bdp.system.bizdate}','yyyymmdd'), - 1,'dd'),'yyyymmdd')
) sq6 on sq0.item_id = sq6.item_id
 left join (
	select 
	*
	from <project_name>.rec_sln_demo_user_table_preprocess_agg_v1
	where ds = TO_CHAR(DATEADD(TO_DATE('${bdp.system.bizdate}','yyyymmdd'), - 1,'dd'),'yyyymmdd')
) sq3 on sq0.user_id = sq3.user_id
;

Dalam pernyataan SQL sebelumnya, Anda harus mengganti <project_name> dengan nama proyek yang sebenarnya.

Referensi

  • Untuk informasi lebih lanjut tentang cara menghasilkan fitur, lihat Platform Fitur dan Rekayasa Fitur.

  • FeatureStore cocok untuk semua skenario yang memerlukan fitur, seperti skenario rekomendasi, kontrol risiko keuangan, dan skenario pertumbuhan pengguna. FeatureStore terintegrasi dengan mesin sumber data umum dan mesin layanan rekomendasi Alibaba Cloud. FeatureStore menyediakan platform end-to-end yang efisien dan nyaman dari pendaftaran fitur hingga pengembangan model dan aplikasi. Untuk informasi lebih lanjut tentang FeatureStore, lihat Ikhtisar.

  • Jika Anda memiliki pertanyaan saat menggunakan FeatureStore, Anda dapat bergabung dengan grup DingTalk 34415007523 untuk bantuan teknis.