Data Analysis Skill
功能描述
对结构化数据进行统计分析,自动检测异常值,计算相关性,并提供可视化建议。
适用场景:
- •数据探索性分析 (EDA)
- •业务指标分析
- •异常检测
- •报表生成
使用示例
输入:
json
{
"data": [
{"date": "2024-01-01", "revenue": 1000, "users": 50},
{"date": "2024-01-02", "revenue": 1200, "users": 55},
{"date": "2024-01-03", "revenue": 950, "users": 48},
{"date": "2024-01-04", "revenue": 1500, "users": 70},
{"date": "2024-01-05", "revenue": 1100, "users": 52}
],
"columns": ["revenue", "users"],
"analysis_types": ["descriptive", "correlation", "outliers"]
}
输出:
json
{
"statistics": {
"revenue": {
"count": 5,
"mean": 1150,
"std": 202.48,
"min": 950,
"max": 1500,
"median": 1100
},
"users": {
"count": 5,
"mean": 55,
"std": 8.37,
"min": 48,
"max": 70,
"median": 52
}
},
"correlations": {
"revenue_users": {
"coefficient": 0.96,
"interpretation": "strong_positive"
}
},
"outliers": [
{"column": "users", "value": 70, "index": 3, "z_score": 1.79}
],
"visualization_suggestions": [
{"type": "line_chart", "columns": ["date", "revenue"], "reason": "时序数据适合折线图"},
{"type": "scatter_plot", "columns": ["users", "revenue"], "reason": "强相关性适合散点图"}
],
"insights": [
"revenue 与 users 呈强正相关 (r=0.96)",
"2024-01-04 的 users 值 (70) 偏高,可能为异常值或特殊事件"
]
}
注意事项
- •数据量限制 10000 条记录
- •仅支持数值型列的统计分析
- •相关性分析需要至少 2 列数值数据
- •异常值检测基于 Z-score (|Z| > 1.5 标记为潜在异常)