
How to Run Python in DataWorks and MaxCompute

This article describes how to run Python in DataWorks and MaxCompute.

By Haoran Wang, Sr. Big Data Solution Architect of Alibaba Cloud

1. How to Create a PyODPS Job?

Note: A PyODPS 2 node runs Python 2 scripts. A PyODPS 3 node runs Python 3 scripts.

If you want to add Python libraries (for example, by installing them with pip), you need to use an exclusive resource group.
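
If it is unclear which interpreter a node actually runs, a quick check at the top of the script can confirm it. The following is a minimal sketch that uses only the standard library; the helper name `interpreter_major` is made up for illustration.

```python
import sys


def interpreter_major(version_info=sys.version_info):
    """Return the major Python version (2 or 3) of the given version tuple."""
    return version_info[0]


# A PyODPS 3 node is expected to report 3 here, a PyODPS 2 node 2.
print("Python major version:", interpreter_major())
print(sys.version)
```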

2. How to Use pip to Install Python Libraries in DataWorks?

Purchase an exclusive resource group for scheduling.

After that, locate the resource group and click O&M Assistant.

Click Create Command

/home/tops/bin/pip3 install neo4j

Click Create, and then click Run.

After the command runs successfully, the neo4j driver is installed and can be imported.
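
To confirm from inside a node that a library is really available, one option is to ask the import system before importing it. This is a small sketch, not part of the original steps; the helper name `is_installed` is hypothetical, and it is assumed that the node runs on the same exclusive resource group where the pip command was executed.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# 'neo4j' is the package installed above; 'json' ships with Python.
for name in ("json", "neo4j"):
    print(name, "available" if is_installed(name) else "missing")
```

If `neo4j` is reported as missing, the node is probably running on a different resource group than the one the command was run on.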

3. How to Use PyODPS to Write Data into MaxCompute

3.1 Write Data via DataFrame

https://www.alibabacloud.com/help/en/maxcompute/latest/execution
https://pyodps.readthedocs.io/zh_CN/latest/df-basic.html#odps

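from odps.df import DataFrame

# `o` is the MaxCompute entry object that DataWorks provides in PyODPS nodes.
df = DataFrame(o.get_table('ods_p2p'))
# Write the data into partition dt=20230307 of ods_p2p_out.
# The partition is created if it is missing and dropped first if it already exists.
df.persist('ods_p2p_out', partition='dt=20230307', drop_partition=True, create_partition=True)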
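
The partition value above is hard-coded. In a job that runs every day, the partition string is usually derived from a date instead. The sketch below shows one way to build it; the helper name `daily_partition` and the choice of `dt` as the default column are assumptions for illustration.

```python
from datetime import date, timedelta


def daily_partition(day, column="dt"):
    """Build a partition spec such as 'dt=20230307' from a date."""
    return "{}={}".format(column, day.strftime("%Y%m%d"))


print(daily_partition(date(2023, 3, 7)))                      # dt=20230307
print(daily_partition(date(2023, 3, 7) - timedelta(days=1)))  # dt=20230306
```

The returned string can be passed as the `partition` argument of `df.persist`.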

3.2 Write Data via List

https://www.alibabacloud.com/help/en/maxcompute/latest/tables

records = [[111, 1.0],                 # A list can be specified.
          [222, 2.0],
          [333, 3.0],
          [444, 4.0]]
o.write_table('my_new_table', records, partition='pt=test', create_partition=True)  # Create a partition named test and write data to the partition.
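
Each record is a list of values in the same order as the table columns. Data pulled from an API often arrives as dictionaries instead, so it has to be reordered first. This is a minimal sketch under that assumption; the helper `dicts_to_rows` and the column names `id` and `score` are hypothetical.

```python
def dicts_to_rows(items, columns):
    """Convert a list of dicts into a list of lists ordered by `columns`.

    Missing keys become None so that every row has the same length.
    """
    return [[item.get(col) for col in columns] for item in items]


items = [{"id": 111, "score": 1.0}, {"score": 2.0, "id": 222}]
records = dicts_to_rows(items, ["id", "score"])
print(records)  # [[111, 1.0], [222, 2.0]]
```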

3.3 Write Data Record by Record

https://www.alibabacloud.com/help/en/maxcompute/latest/tables

t = o.get_table('my_new_table')
with t.open_writer(partition='pt=test02', create_partition=True) as writer:  # Create a partition named test02 and write data to the partition.
    records = [[1, 1.0],                 # A list can be specified.
              [2, 2.0],
              [3, 3.0],
              [4, 4.0]]
    writer.write(records)  # Records can be iterable objects.
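
When the source is large, it can help to pass the writer one batch at a time instead of building the full list in memory. The following sketch shows only the batching part with a hypothetical helper `chunked`; the call to `writer.write` is left as a comment because it needs a real table.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


rows = ([i, float(i)] for i in range(1, 6))
for batch in chunked(rows, 2):
    # Inside the `with t.open_writer(...)` block this would be writer.write(batch).
    print(batch)
```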

4. Appendix

4.1 How to Read Pipedrive

/home/tops/bin/pip3 install pipedrive-python-lib

from pipedrive.client import Client

client = Client(domain='https://XXX.pipedrive.com/')
client.set_api_token('xxx')

ft_cols = ['_fivetran_start', '_fivetran_active', '_fivetran_end', '_fivetran_synced', '_fivetran_deleted']
deals = []  # one API response (page) per element
more_items = True
start = 0
print('getting pipedrive - data!!!!')
while more_items:
    dl = client.deals.get_all_deals({'start': start})
    if dl['success']:
        deals.append(dl)
        pagination = dl['additional_data']['pagination']
        more_items = pagination['more_items_in_collection']
        # next_start is missing on the last page, so keep the old value in that case.
        start = pagination.get('next_start', start)
        print(start)
    else:
        # Note: this retries the same page without a limit.
        print('Get deal error, retrying!!!!')

print('getting data - Done!!!!')

print('transforming data!!!!')
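
The loop above retries forever if the API keeps failing. A variant with a retry limit could look like the sketch below. It is not tied to the Pipedrive client: `fetch_all`, `fake_fetch`, and `max_retries` are hypothetical names, and the response shape (`success`, `data`, `additional_data.pagination`) is assumed to match the responses used above.

```python
def fetch_all(fetch_page, max_retries=3):
    """Collect items from an API paginated with start / next_start.

    `fetch_page(start)` must return a dict shaped like the responses above.
    """
    items, start, failures = [], 0, 0
    while True:
        page = fetch_page(start)
        if not page.get("success"):
            failures += 1
            if failures > max_retries:
                raise RuntimeError("giving up after repeated failures")
            continue
        failures = 0
        items.extend(page.get("data") or [])
        pagination = page["additional_data"]["pagination"]
        if not pagination["more_items_in_collection"]:
            return items
        start = pagination["next_start"]


# Stand-in for client.deals.get_all_deals: serves 5 fake deals, 2 per page.
def fake_fetch(start, total=5, limit=2):
    end = min(start + limit, total)
    pagination = {"more_items_in_collection": end < total}
    if end < total:
        pagination["next_start"] = end
    return {"success": True,
            "data": [{"id": i} for i in range(start, end)],
            "additional_data": {"pagination": pagination}}


deals = fetch_all(fake_fetch)
print(len(deals))  # 5
```

With the real client, `fetch_page` could be something like `lambda start: client.deals.get_all_deals({'start': start})`.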

4.2 How to Read Zendesk

/home/tops/bin/pip3 install zenpy

from zenpy import Zenpy

zenpy_client = Zenpy(subdomain='xxx', email='xxx', password='xxx')

# search_export returns an iterable of results; loop over it to read individual tickets.
tkts = zenpy_client.search_export(type='ticket')
print(tkts)
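
Printing the result object does not show the tickets themselves. To look at a few results without pulling the whole export, the iterable can be sliced. This sketch uses a stand-in generator instead of a real Zendesk account; `preview` and `fake_tickets` are hypothetical names.

```python
from itertools import islice


def preview(results, n=3):
    """Return the first `n` items of an iterable without reading the rest."""
    return list(islice(results, n))


# Stand-in generator; with zenpy this would be zenpy_client.search_export(type='ticket').
fake_tickets = ({"id": i, "status": "open"} for i in range(1000))
for ticket in preview(fake_tickets, 2):
    print(ticket)
```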

4.3 How to Convert JSON to a DataFrame

import requests
import json
import pandas as pd

r = requests.get('http://www.starcapital.de/test/Res_Stockmarketvaluation_FundamentalKZ_Tbl.php')

a = json.loads(r.text)

# json_normalize flattens nested JSON and already returns a DataFrame.
# pandas 1.0 and later expose it as pd.json_normalize;
# older versions use `from pandas.io.json import json_normalize`.
df = pd.json_normalize(a)
print(df)
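
If pandas is not available, or the goal is a plain list of rows for `o.write_table`, nested JSON can also be flattened with the standard library. The sketch below is a simplified stand-in for what json_normalize does, not a replacement for it; the sample JSON, the helper `flatten`, and the column names are made up for illustration.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


text = '[{"id": 1, "price": {"pe": 15.2, "pb": 1.8}}, {"id": 2, "price": {"pe": 22.0}}]'
flat_items = [flatten(item) for item in json.loads(text)]
columns = ["id", "price.pe", "price.pb"]
rows = [[item.get(col) for col in columns] for item in flat_items]
print(rows)  # [[1, 15.2, 1.8], [2, 22.0, None]]
```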

4.4 How to Read Amplitude

import requests

start = '20230305T01'
end = '20230305T02'
api_key = ''
secret_key = ''

request_url = 'https://amplitude.com/api/2/export?start={}&end={}'.format(start, end)
headers = {"Content-Type": "application/json", "Accept": "application/json"}

# requests builds the HTTP Basic Authorization header from the auth tuple,
# so the key and secret do not need to be base64-encoded by hand.
r = requests.get(request_url, headers=headers, auth=(api_key, secret_key))

print(r)  # prints the response status, for example <Response [200]>
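
Passing `auth=(api_key, secret_key)` lets requests handle HTTP Basic authentication. For clients where the header has to be built by hand, the `key:secret` string must be base64-encoded first; sending it unencoded is not valid Basic auth. A minimal sketch, with a hypothetical helper name and dummy credentials:

```python
import base64


def basic_auth_header(api_key, secret_key):
    """Build the HTTP Basic Authorization header value for key:secret."""
    token = base64.b64encode("{}:{}".format(api_key, secret_key).encode("utf-8"))
    return "Basic " + token.decode("ascii")


header = basic_auth_header("key", "secret")
print(header)  # Basic a2V5OnNlY3JldA==
```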

https://community.amplitude.com/data-instrumentation-57/cannot-read-file-in-python-from-export-api-due-to-decoding-issue-1930
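
The thread linked above deals with decoding the export response. The sketch below assumes that the response body is a zip archive whose members are gzipped files with one JSON event per line; this should be checked against the actual response before relying on it. The helper `read_export` and the sample events are hypothetical, and the archive is built in memory so the example runs without credentials.

```python
import gzip
import io
import json
import zipfile


def read_export(zip_bytes):
    """Decode a zip archive of gzipped JSON-lines files into a list of dicts."""
    events = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            raw = archive.read(name)
            if name.endswith(".gz"):
                raw = gzip.decompress(raw)
            for line in raw.decode("utf-8").splitlines():
                if line.strip():
                    events.append(json.loads(line))
    return events


# Build a small in-memory archive as a stand-in for r.content.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    lines = '{"event_type": "login"}\n{"event_type": "purchase"}\n'
    archive.writestr("sample/part-0.json.gz", gzip.compress(lines.encode("utf-8")))

events = read_export(buffer.getvalue())
print(len(events))  # 2
```

With a real response, `read_export(r.content)` would be the equivalent call.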
