This topic describes how a PyODPS node in DataWorks can reference third-party code, either by depending on common Python scripts or by installing open source third-party packages.
Depend on common Python scripts
Depend on an open source third-party package
If you want to depend on an open source third-party package, you must use pip to install
the package. In addition, the following requirements must be met:
- Use an exclusive resource group for scheduling. For more information, see Add an exclusive resource group for scheduling.
- Install the required third-party package in O&M Assistant of the exclusive resource
group for scheduling. For more information, see O&M Assistant. PyODPS nodes include PyODPS 2 nodes and PyODPS 3 nodes, and the installation command differs between the two node types.
- To install the package for a PyODPS 2 node, run the following command:
pip install <Package to be installed> -i https://pypi.tuna.tsinghua.edu.cn/simple
If you are prompted to upgrade pip after you run the preceding command, run the following command:
pip install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple
- To install the package for a PyODPS 3 node, run the following command:
/home/tops/bin/pip3 install <Package to be installed> -i https://pypi.tuna.tsinghua.edu.cn/simple
If you are prompted to upgrade pip after you run the preceding command, run the following command:
/home/tops/bin/pip3 install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple
If the following error is reported when you use the PyODPS 3 node, submit a ticket to apply for permissions:
"/home/admin/usertools/tools/cmd-0.sh: line 3: /home/tops/bin/python3: The file or directory does not exist."
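After the installation finishes, you may want to confirm from inside the PyODPS node that the package can actually be imported. The following is a minimal sketch and not part of the official procedure: the helper name check_package and the module name numpy are hypothetical examples. It relies only on the standard importlib module, so it should run on both PyODPS 2 (Python 2) and PyODPS 3 (Python 3) nodes.

```python
import importlib


def check_package(module_name):
    """Return the module's version string, or None if the import fails.

    check_package is a hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not visible to this node's Python interpreter.
        return None
    # Not every module defines __version__, so fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


# "numpy" is an assumed example; replace it with the package you installed.
version = check_package("numpy")
if version is None:
    print("Package is not importable; check the pip command used for this node type.")
else:
    print("Package is importable, version: %s" % version)
```

If the check reports that the package is not importable, a likely cause is that the package was installed with the pip command for the other node type, because PyODPS 2 and PyODPS 3 nodes use different interpreters.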