AWS EMR Skill
v2.0.0
AWS EMR interaction skill for managing EMR Serverless, EMR on EC2, and EMR on EKS. Submit and manage Spark, Hive, and PySpark jobs across all three EMR deployment modes. Use this skill when the user mentions EMR, Spark, Hive, Serverless, big data jobs, submit job, query job status, get job logs, cancel job, EMR cluster, EMR step, virtual cluster, EKS, or similar keywords.
AWS EMR Skills
A Python skill for interacting with AWS EMR across three deployment modes: EMR Serverless, EMR on EC2, and EMR on EKS. Submit Spark and Hive jobs, manage clusters and applications, monitor job status, and retrieve logs.
When to Use (Trigger Phrases)
Invoke this skill when the user mentions:
- "Submit a Spark job on EMR"
- "List EMR Serverless applications"
- "Add a step to my EMR cluster"
- "Get EMR job logs"
- "Check EMR job status"
- "Cancel running EMR job"
- "List EMR clusters"
- "Create an EMR on EKS virtual cluster"
- "Submit PySpark to EMR Serverless"
- "Get step logs from EMR cluster"
- Any request involving EMR Serverless applications/jobs, EMR on EC2 clusters/steps, or EMR on EKS virtual clusters/job runs.
Feature List
- EMR Serverless
  - Applications: list, describe, start, stop EMR Serverless applications
  - Job Submission: submit Spark SQL, Spark JAR, PySpark, and Hive jobs (sync/async)
  - Job Lifecycle: get status, cancel, list job runs
  - Results: retrieve SQL query results from S3
  - Logs: get driver stdout/stderr logs with secret masking
- EMR on EC2
  - Clusters: list, describe, terminate EMR clusters
  - Step Submission: add Spark, PySpark, and Hive steps via command-runner.jar
  - Step Lifecycle: list, describe, cancel steps
  - Logs: get step logs (stderr, stdout, controller, syslog) from S3
- EMR on EKS
  - Virtual Clusters: list, describe, create, delete virtual clusters
  - Job Submission: submit Spark and Spark SQL jobs to EKS
  - Job Lifecycle: describe, list, cancel job runs
  - Logs: get job logs from S3
Initial Setup
Python 3.8+ with boto3>=1.26.0:
pip install "boto3>=1.26.0"
AWS credentials via the boto3 default chain (env vars, config files, IAM roles).
Environment variables (all optional, validated at point of use):
export AWS_REGION="us-east-1"
# EMR Serverless
export EMR_SERVERLESS_APPLICATION_ID="00abcdef12345678"
export EMR_SERVERLESS_EXEC_ROLE_ARN="arn:aws:iam::123456789012:role/emr-role"
export EMR_SERVERLESS_S3_LOG_URI="s3://my-bucket/emr-logs/"
# EMR on EC2
export EMR_CLUSTER_ID="j-XXXXXXXXXXXXX"
# EMR on EKS
export EMR_EKS_VIRTUAL_CLUSTER_ID="abc123def456"
export EMR_EKS_EXEC_ROLE_ARN="arn:aws:iam::123456789012:role/emr-eks-role"
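"Validated at point of use" can be read as: nothing fails at import time, and a missing variable raises an error only when an operation actually needs it. The following is a minimal sketch of that pattern. The helper names `get_setting` and `require_setting` and the `MissingConfigError` class are hypothetical and are not taken from scripts/config/emr_config.py.

```python
import os


class MissingConfigError(RuntimeError):
    """Raised when an operation needs a setting that is not set (hypothetical)."""


def get_setting(name, default=None):
    """Return an optional environment setting without failing if it is absent."""
    value = os.environ.get(name, "").strip()
    return value or default


def require_setting(name):
    """Return a setting, failing only when an operation actually needs it."""
    value = get_setting(name)
    if value is None:
        raise MissingConfigError(
            f"{name} is not set; export it or pass the value explicitly"
        )
    return value


# Nothing is checked at import time. A None region leaves region
# resolution to boto3's own configuration chain.
region = get_setting("AWS_REGION")
```

With this shape, listing clusters works even when `EMR_CLUSTER_ID` is unset, and only a step operation that calls `require_setting("EMR_CLUSTER_ID")` reports the missing value.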
How to Manage EMR
- EMR Serverless
Fully managed serverless Spark/Hive execution. No infrastructure to manage.
- Application management: scripts/on_serverless/emr_serverless_cli.py (14 @tool functions)
- Detailed guide: references/emr_serverless/application_guide.md (application lifecycle)
- Detailed guide: references/emr_serverless/job_guide.md (job submission, results, logs)
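For orientation, this is roughly what a PySpark submission to EMR Serverless looks like at the boto3 level. It is a sketch under stated assumptions, not the skill's own code: `build_pyspark_request` and `submit_and_wait` are hypothetical names, while `start_job_run` and `get_job_run` are the boto3 `emr-serverless` client calls.

```python
import time

# Job run states after which polling can stop.
TERMINAL_STATES = {"SUCCESS", "FAILED", "CANCELLED"}


def build_pyspark_request(application_id, role_arn, script_uri, log_uri=None, args=()):
    """Build keyword arguments for the emr-serverless start_job_run call."""
    request = {
        "applicationId": application_id,
        "executionRoleArn": role_arn,
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": script_uri,
                "entryPointArguments": list(args),
            }
        },
    }
    if log_uri:
        # Ask EMR Serverless to write driver and executor logs to S3.
        request["configurationOverrides"] = {
            "monitoringConfiguration": {
                "s3MonitoringConfiguration": {"logUri": log_uri}
            }
        }
    return request


def submit_and_wait(request, region=None, poll_seconds=15):
    """Submit the job and poll until it reaches a terminal state (sync mode)."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("emr-serverless", region_name=region)
    job_run_id = client.start_job_run(**request)["jobRunId"]
    while True:
        job_run = client.get_job_run(
            applicationId=request["applicationId"], jobRunId=job_run_id
        )["jobRun"]
        if job_run["state"] in TERMINAL_STATES:
            return job_run_id, job_run["state"]
        time.sleep(poll_seconds)
```

An asynchronous submission would return right after `start_job_run` and leave status checks to a later call.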
- EMR on EC2
Traditional EMR clusters on EC2 instances. Submit work as Steps.
- Cluster & step management: scripts/on_ec2/emr_on_ec2_cli.py (10 @tool functions)
- Detailed guide: references/emr_on_ec2/cluster_guide.md (cluster lifecycle)
- Detailed guide: references/emr_on_ec2/step_guide.md (step submission, logs)
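Submitting work as a step typically means wrapping a `spark-submit` command in a `command-runner.jar` step definition and passing it to the boto3 `emr` client. The sketch below illustrates that shape. `build_pyspark_step` and `add_step` are hypothetical names rather than the skill's @tool functions, and the `--deploy-mode cluster` argument and `CONTINUE` failure action are assumed defaults.

```python
def build_pyspark_step(name, script_uri, args=(), action_on_failure="CONTINUE"):
    """Build a step definition that runs spark-submit via command-runner.jar."""
    return {
        "Name": name,
        "ActionOnFailure": action_on_failure,
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri, *args],
        },
    }


def add_step(cluster_id, step, region=None):
    """Add one step to a running cluster and return its step ID."""
    import boto3  # imported lazily so the step builder works without boto3

    client = boto3.client("emr", region_name=region)
    response = client.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]
```

A call such as `add_step("j-XXXXXXXXXXXXX", build_pyspark_step("etl", "s3://my-bucket/job.py"))` would return the ID of the new step, which the step lifecycle operations can then describe or cancel.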
- EMR on EKS
Spark workloads on Amazon EKS via the emr-containers API.
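As a rough illustration of the emr-containers API, a Spark job run on a virtual cluster can be started as sketched below. `build_eks_job_request` and `start_eks_job` are hypothetical names, not the skill's own functions, and the release label is a caller-supplied value such as `emr-6.15.0-latest`.

```python
def build_eks_job_request(virtual_cluster_id, role_arn, release_label,
                          script_uri, name="spark-job", log_uri=None):
    """Build keyword arguments for the emr-containers start_job_run call."""
    request = {
        "virtualClusterId": virtual_cluster_id,
        "name": name,
        "executionRoleArn": role_arn,
        "releaseLabel": release_label,
        "jobDriver": {"sparkSubmitJobDriver": {"entryPoint": script_uri}},
    }
    if log_uri:
        # Optional: send job logs to S3 so they can be retrieved later.
        request["configurationOverrides"] = {
            "monitoringConfiguration": {
                "s3MonitoringConfiguration": {"logUri": log_uri}
            }
        }
    return request


def start_eks_job(request, region=None):
    """Start the job run and return its ID."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("emr-containers", region_name=region)
    return client.start_job_run(**request)["id"]
```

The returned job run ID, together with the virtual cluster ID, is what the describe and cancel operations need.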
- Virtual cluster & job management: scripts/on_eks/emr_on_eks_cli.py (10 @tool functions)
- Detailed guide: references/emr_on_eks/virtual_cluster_guide.md (virtual cluster lifecycle)
- Detailed guide: references/emr_on_eks/job_run_guide.md (job submission, logs)
Available Scripts
| Script | Description |
| --- | --- |
| scripts/on_serverless/emr_serverless_cli.py | EMR Serverless @tool functions (14 tools) |
| scripts/on_ec2/emr_on_ec2_cli.py | EMR on EC2 @tool functions (10 tools) |
| scripts/on_eks/emr_on_eks_cli.py | EMR on EKS @tool functions (10 tools) |
| scripts/config/emr_config.py | Unified configuration management |
| scripts/client/boto_client.py | boto3 client factory |
References
| Document | Description |
| --- | --- |
| references/emr_serverless/application_guide.md | EMR Serverless application management guide |
| references/emr_serverless/job_guide.md | EMR Serverless job submission and management guide |
| references/emr_on_ec2/cluster_guide.md | EMR on EC2 cluster management guide |
| references/emr_on_ec2/step_guide.md | EMR on EC2 step submission and management guide |
| references/emr_on_eks/virtual_cluster_guide.md | EMR on EKS virtual cluster management guide |
| references/emr_on_eks/job_run_guide.md | EMR on EKS job run management guide |
Requirements
- When writing temporary files (scripts, notes, etc.), place them in the ./tmp folder.
- When importing the scripts packages, add the skill root to the path: sys.path.append(${emr_skill_root})
- AWS credentials are handled by boto3's default credential chain. Never pass access keys directly.
- All configuration environment variables are optional and validated at the point of use.
Data Privacy & Trust
- No credential storage: AWS credentials are resolved via the boto3 default chain. No keys are stored or logged.
- Secret masking: log retrieval functions automatically mask potential AWS credentials in output.
- Read-only by default: most operations are read-only queries. Write operations (job submission, cluster termination) require explicit user action.
External Endpoints
This skill connects to:
- AWS EMR Serverless API (emr-serverless.{region}.amazonaws.com)
- AWS EMR API (elasticmapreduce.{region}.amazonaws.com)
- AWS