CSV Data Explorer
v1.0.0
Explore, filter, summarize, and visualize CSV data directly in the terminal with interactive queries.
What This Does
A command-line tool to explore, analyze, and visualize CSV data directly from the terminal. Load CSV files, filter rows, calculate statistics, generate summaries, and create basic visualizations without leaving your terminal.
Key features:
- Load and preview CSV files with automatic delimiter detection
- Explore data structure: view columns, data types, and missing values
- Filter rows based on conditions (equality, inequality, contains, regex)
- Select columns: include or exclude specific columns
- Calculate statistics: mean, median, min, max, standard deviation, percentiles
- Generate summaries: count, unique values, frequency distributions
- Basic visualizations: histograms, bar charts, scatter plots (ASCII or simple terminal output)
- Export results: filtered data, statistics, and summaries to new CSV/JSON files
- Interactive mode: step-by-step exploration with prompts
- Command-line mode: scriptable operations for automation

When To Use
- You need to quickly explore CSV data without opening spreadsheets
- You want to filter and analyze data for reporting or debugging
- You need to calculate basic statistics on datasets
- You're working on servers or remote machines without GUI tools
- You want to automate CSV data processing in scripts
- You need to share analysis results with team members
- You're teaching data analysis concepts in a terminal environment

Usage
Basic commands:
# Load and preview a CSV file
python3 scripts/main.py preview data.csv

# Show basic statistics
python3 scripts/main.py stats data.csv

# Filter rows where column 'age' > 30
python3 scripts/main.py filter data.csv --where "age > 30"

# Select specific columns
python3 scripts/main.py select data.csv --columns name,age,salary

# Generate histogram for a column
python3 scripts/main.py histogram data.csv --column age --bins 10

# Count unique values in a column
python3 scripts/main.py unique data.csv --column category

# Export filtered data
python3 scripts/main.py filter data.csv --where "salary > 50000" --output filtered.csv

# Interactive exploration mode
python3 scripts/main.py interactive data.csv
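For scripting, it can help to see how the subcommands and flags above fit together. The following is a minimal sketch of a parser with the same command shape, built on the standard argparse module. The function name build_parser and the default for --limit are hypothetical, since the contents of the tool's script are not shown here.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the commands listed above."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    preview = sub.add_parser("preview")
    preview.add_argument("file")
    preview.add_argument("--limit", type=int, default=5)  # default is an assumption

    for name in ("stats", "interactive"):
        sub.add_parser(name).add_argument("file")

    flt = sub.add_parser("filter")
    flt.add_argument("file")
    flt.add_argument("--where", required=True)
    flt.add_argument("--output")
    flt.add_argument("--stats", action="store_true")

    select = sub.add_parser("select")
    select.add_argument("file")
    # Split "name,age,salary" into a list of column names.
    select.add_argument("--columns", type=lambda s: s.split(","))

    hist = sub.add_parser("histogram")
    hist.add_argument("file")
    hist.add_argument("--column", required=True)
    hist.add_argument("--bins", type=int, default=10)

    unique = sub.add_parser("unique")
    unique.add_argument("file")
    unique.add_argument("--column", required=True)
    return parser


args = build_parser().parse_args(["filter", "data.csv", "--where", "age > 30"])
print(args.command, args.where)
```

Keeping one subparser per command means each command only accepts the flags that apply to it, which makes mistakes in automation scripts fail early with a usage message.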
Examples
Example 1: Preview and basic statistics
python3 scripts/main.py preview sales.csv --limit 10
Output:
CSV File: sales.csv (1000 rows × 5 columns)
First 10 rows:
┌─────┬────────────┬───────────┬────────┬───────────┐
│ Row │ Date       │ Product   │ Amount │ Region    │
├─────┼────────────┼───────────┼────────┼───────────┤
│ 1   │ 2024-01-01 │ Widget A  │ 150.50 │ North     │
│ 2   │ 2024-01-01 │ Widget B  │ 89.99  │ South     │
│ ... │ ...        │ ...       │ ...    │ ...       │
└─────┴────────────┴───────────┴────────┴───────────┘
Column summary:
- Date: 1000 non-null, type: datetime
- Product: 1000 non-null, type: string (5 unique values)
- Amount: 1000 non-null, type: float (min: 10.00, max: 999.99)
- Region: 1000 non-null, type: string (4 unique values)
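A preview with a column summary of this kind can be approximated with the standard library alone. This is a sketch under assumptions, not the tool's implementation: sniff_delimiter, infer_type, and summarize_columns are hypothetical helper names, and the type inference only distinguishes int, float, and string (no datetime detection).

```python
import csv
import io


def sniff_delimiter(sample, candidates=",;\t|"):
    """Guess the delimiter from a text sample; fall back to a comma."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        return ","


def infer_type(values):
    """Classify a list of non-empty strings as int, float, or string."""
    if not values:
        return "empty"
    for label, cast in (("int", int), ("float", float)):
        try:
            for v in values:
                cast(v)
            return label
        except ValueError:
            continue
    return "string"


def summarize_columns(text):
    """Return {column: (non_null_count, inferred_type)} for CSV text."""
    reader = csv.DictReader(io.StringIO(text), delimiter=sniff_delimiter(text))
    rows = list(reader)
    summary = {}
    for name in reader.fieldnames or []:
        # Treat empty strings and short rows (None) as missing values.
        present = [r[name] for r in rows if r[name] not in (None, "")]
        summary[name] = (len(present), infer_type(present))
    return summary


sample = "name;age;city\nAnn;31;Oslo\nBob;;Rome\nCy;27.5;Lima\n"
for column, (count, kind) in summarize_columns(sample).items():
    print(f"- {column}: {count} non-null, type: {kind}")
```

The sample data here is illustrative only; it is semicolon-delimited to exercise the delimiter guess and has one empty cell to exercise the non-null count.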
Example 2: Filter and calculate statistics
python3 scripts/main.py filter sales.csv --where "Region == 'North' and Amount > 100" --stats
Output:
Filtered data: 237 rows (from 1000 total)
Statistics for filtered data:
- Count: 237
- Mean Amount: 245.67
- Median Amount: 210.50
- Min Amount: 101.00
- Max Amount: 999.99
- Standard Deviation: 145.23
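The figures in a report like this can be reproduced with Python's statistics module. A minimal sketch, assuming the rows are already loaded as dictionaries and the condition is written as an ordinary Python predicate rather than parsed from a --where string. The helper filter_stats is hypothetical, the rows are illustrative, and using the sample standard deviation is an assumption because the tool's choice is not stated.

```python
import statistics


def filter_stats(rows, predicate, column):
    """Filter rows with `predicate`, then summarize the numeric `column`."""
    values = [float(r[column]) for r in rows if predicate(r)]
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


rows = [
    {"Region": "North", "Amount": "150.50"},
    {"Region": "South", "Amount": "89.99"},
    {"Region": "North", "Amount": "99.00"},
    {"Region": "North", "Amount": "250.00"},
]
result = filter_stats(
    rows,
    lambda r: r["Region"] == "North" and float(r["Amount"]) > 100,
    "Amount",
)
print(result["count"], round(result["mean"], 2))
```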
Example 3: Generate histogram
python3 scripts/main.py histogram sales.csv --column Amount --bins 5
Output (ASCII approximation):
Amount Distribution (5 bins):
[10.00 - 207.99]  ████████████████████████████ 312
[208.00 - 405.99] ████████████████████ 241
[406.00 - 603.99] ██████████ 152
[604.00 - 801.99] █████ 78
[802.00 - 999.99] ███ 45
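A terminal histogram comes down to equal-width binning plus bar scaling. A minimal sketch, assuming equal-width bins over the range from the minimum to the maximum value, with bars scaled so the fullest bin spans `width` characters. The helper ascii_histogram is hypothetical, and the tool's exact bin edges and labels may differ.

```python
def ascii_histogram(values, bins=5, width=30):
    """Return text lines, one per equal-width bin, with proportional bars."""
    lo, hi = min(values), max(values)
    span = (hi - lo) / bins or 1.0  # avoid zero-width bins when all values match
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((v - lo) / span), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        start, end = lo + i * span, lo + (i + 1) * span
        bar = "█" * round(width * count / peak)
        lines.append(f"[{start:.2f} - {end:.2f}] {bar} {count}")
    return lines


for line in ascii_histogram([10, 12, 15, 40, 42, 90, 95, 99, 100], bins=3):
    print(line)
```

Scaling against the fullest bin rather than the total row count keeps the longest bar at a fixed width, so the chart fits the terminal regardless of dataset size.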
Example 4: Interactive mode
python3 scripts/main.py interactive sales.csv
Interactive mode guides you through:
- File loading and preview
- Column selection and filtering
- Statistical analysis
- Visualization options
- Export results

Requirements
- Python 3.x
- pandas library for data manipulation (installed automatically or via pip)
- matplotlib library for visualizations (optional, for enhanced charts)
Install missing dependencies:
pip3 install pandas matplotlib
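Before running the tool on a fresh machine, a quick check shows which of these packages are importable. A minimal sketch using importlib; check_dependencies is a hypothetical helper, and treating pandas as required and matplotlib as optional follows the Requirements list above.

```python
import importlib.util


def check_dependencies(required=("pandas",), optional=("matplotlib",)):
    """Map each package name to True or False depending on whether it is importable."""
    status = {}
    for name in (*required, *optional):
        status[name] = importlib.util.find_spec(name) is not None
    return status


status = check_dependencies()
missing = [name for name, found in status.items() if not found]
if missing:
    print("Install with: pip3 install " + " ".join(missing))
```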
Limitations
- Large files (>100MB) may be slow to process
- Visualizations are ASCII-based or simple terminal plots
- No support for Excel files or other formats (CSV only)
- Limited to basic statistical functions (not advanced analytics)
- No support for time series analysis or complex aggregations
- Memory usage scales with file size
- No built-in support for database connections
- No support for streaming or processing very large datasets
- Visualizations limited to terminal capabilities
- No support for geographic data or maps
- Limited error handling for malformed CSV files
- No built-in data cleaning or transformation functions
- Performance may be slower than specialized tools like R or specialized libraries

Directory Structure
The tool works with CSV files in the current directory or specified paths. No special configuration directories are required.
Error Handling
- Invalid CSV files show helpful error messages with line numbers
- Missing columns suggest available colum