
Step 1: Data preparation

Last Updated: Feb 24, 2018

The data in this example comes from “house_basic_info”, the product data information table of a resale house website. The table is stored in RDS-MySQL (region: Zone A in South China 1 of Alibaba Cloud; network: VPC) and is updated daily.

Note:

You can also directly use the Product data information table of a resale house website in DTplus Public Dataset - Second-hand House Dataset, but its data volume may differ slightly from that of this example.

The data is described as follows:

Field                   Field type   Description
house_id                varchar      House ID
house_city              varchar      The city where the house is located
house_total_price       double       Total price of the house
house_unit_price        double       Unit price of the house
house_type              varchar      House type
house_floor             varchar      Floor of the house
house_direction         varchar      House orientation
house_deckoration       varchar      House decoration
house_area              double       House area
house_community_name    varchar      Name of the community the house belongs to
house_region            varchar      The region where the house is located
proj_name               varchar      Project name
proj_addr               varchar      Project address
period                  int          Property right period
property                varchar      Property management company
greening_rate           varchar      Greening rate
property_costs          varchar      Property management fee
datetime                varchar      Date of the data

Data sample (comma-separated):

  000404705c6add1dc08e54ba10720698, Beijing, 8000000, 72717, 3 bedrooms and 1 living room, lower floor/24 floors, South, flat/decorated, 137, Ximeng Liyuan, Caoqiao of Fengtai, between the Third and the Fourth Ring Roads, null, null, null, null, null, null, 20170605

The statement that creates the house_basic_info table on RDS-MySQL is as follows:

  CREATE TABLE `house_basic_info` (
    `house_id` varchar(1024) NOT NULL COMMENT 'House ID',
    `house_city` varchar(1024) NULL COMMENT 'The city where the house is located',
    `house_total_price` double NULL COMMENT 'Total price of the house',
    `house_unit_price` double NULL COMMENT 'Unit price of the house',
    `house_type` varchar(1024) NULL COMMENT 'House type',
    `house_floor` varchar(1024) NULL COMMENT 'Floor of the house',
    `house_direction` varchar(1024) NULL COMMENT 'House direction',
    `house_deckoration` varchar(512) NULL COMMENT 'House decoration',
    `house_area` double NULL COMMENT 'House area',
    `house_community_name` varchar(1024) NULL COMMENT 'Name of the community the house belongs to',
    `house_region` varchar(1024) NULL COMMENT 'The region where the house is located',
    `proj_name` varchar(1024) NULL,
    `proj_addr` varchar(1024) NULL,
    `period` int(11) NULL,
    `property` varchar(1024) NULL,
    `greening_rate` varchar(1024) NULL,
    `property_costs` varchar(1024) NULL,
    `datetime` varchar(512) NULL COMMENT 'Date of the data'
  ) ENGINE=InnoDB
  DEFAULT CHARACTER SET=utf8 COLLATE=utf8_general_ci
  COMMENT='Product data information table of a resale house website';
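
To check that the table accepts data in the expected shape, the sample record shown earlier could be loaded with a statement along the following lines. This is only an illustrative sketch, not part of the original procedure. It assumes that the two price fields are stored as plain numbers without thousands separators, that the "null" entries in the sample stand for SQL NULL, and that the text between "Ximeng Liyuan" and the first "null" is a single house_region value containing a comma.

```sql
-- Illustrative sketch: insert the sample record into house_basic_info.
-- Assumptions: prices are plain numbers, "null" means SQL NULL, and
-- house_region is one value that itself contains a comma.
INSERT INTO `house_basic_info` (
  `house_id`, `house_city`, `house_total_price`, `house_unit_price`,
  `house_type`, `house_floor`, `house_direction`, `house_deckoration`,
  `house_area`, `house_community_name`, `house_region`,
  `proj_name`, `proj_addr`, `period`, `property`,
  `greening_rate`, `property_costs`, `datetime`
) VALUES (
  '000404705c6add1dc08e54ba10720698', 'Beijing', 8000000, 72717,
  '3 bedrooms and 1 living room', 'lower floor/24 floors', 'South', 'flat/decorated',
  137, 'Ximeng Liyuan', 'Caoqiao of Fengtai, between the Third and the Fourth Ring Roads',
  NULL, NULL, NULL, NULL,
  NULL, NULL, '20170605'
);

-- Sanity check: count the rows loaded for each data date.
SELECT `datetime`, COUNT(*) AS row_count
FROM `house_basic_info`
GROUP BY `datetime`
ORDER BY `datetime`;
```

Because the table is updated daily, grouping by the datetime column is one simple way to confirm that each day's data has arrived.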

What to do next

You have now learned how to prepare the data required for the experiments. The next step is to configure the RDS data sources that the experiments use. For more information, see Configure RDS data sources.
