CSV Formatting Skill
When to Use
Use this skill when creating, cleaning, or exporting CSV files from scraped data, database queries, or any structured output.
File Standards
Encoding & Format
- UTF-8 encoding (no BOM)
- Unix line endings (LF, not CRLF)
- `.csv` extension
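The three rules above can be checked on the raw bytes of a file. The following is a minimal sketch, not a definitive validator; the function name `check_encoding_and_newlines` is hypothetical, and it assumes that any CRLF sequence is unwanted, including one inside a quoted field.

```python
def check_encoding_and_newlines(path):
    """Return a list of problems found in the raw bytes of a CSV file."""
    problems = []
    with open(path, "rb") as f:
        raw = f.read()
    # A UTF-8 byte order mark is the three bytes EF BB BF at the start.
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 BOM")
    # Assumption: CRLF is flagged anywhere, even inside quoted fields.
    if b"\r\n" in raw:
        problems.append("file contains CRLF line endings")
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
    if not path.lower().endswith(".csv"):
        problems.append("file does not use the .csv extension")
    return problems
```

An empty list means the file passed all four checks.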
Headers
- First row is always headers
- Lowercase with underscores: `company_name,job_title,date_posted`
- No spaces, no special characters
- Short but descriptive
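As an illustration of the header rules, here is a small sketch that converts raw column labels into lowercase names with underscores. The helper name `normalize_header` is hypothetical, and it assumes ASCII-only headers are wanted, so accented letters are treated as special characters.

```python
import re


def normalize_header(name):
    """Convert a raw column label to lowercase_with_underscores."""
    # Replace every run of characters that are not ASCII letters or digits
    # with a single underscore (assumption: ASCII-only headers are desired).
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    # Drop underscores left at either end, then lowercase.
    return cleaned.strip("_").lower()


headers = [normalize_header(h) for h in ["Company Name", "Job-Title", "Date Posted (UTC)"]]
print(headers)
```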
Delimiters
- Use comma `,` as the default delimiter
- If the data contains many commas, consider tab-delimited (`.tsv`)
- Never mix delimiters in one file
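To show what the delimiter choice changes in practice, the sketch below writes the same rows with a comma and with a tab. The function name `write_delimited` and the sample address are illustrative assumptions.

```python
import csv
import io


def write_delimited(rows, delimiter=","):
    """Serialize rows to a string using one delimiter for the whole output."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["name", "address"], ["Acme", "1 Main St, Springfield"]]
as_csv = write_delimited(rows)        # the comma in the address forces quoting
as_tsv = write_delimited(rows, "\t")  # no quoting needed with tabs
```

With tabs, fields that contain commas no longer need quoting, which is the main reason to prefer `.tsv` for comma-heavy data.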
Data Formatting
Text Fields
- Wrap in double quotes if the field contains a comma, newline, or double quote
- Escape double quotes by doubling: `"She said ""hello"""`
- Trim leading/trailing whitespace
- Empty string for missing values (not "N/A", "null", "None", or "-")
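Python's `csv` module already applies the quoting and escaping rules above when its default `QUOTE_MINIMAL` mode is used, so only trimming and missing-value handling need custom code. The sketch below assumes a hypothetical helper `clean_text` and treats exactly the placeholders listed above as missing.

```python
import csv
import io

# Placeholders treated as missing (assumption: taken from the list above).
MISSING_PLACEHOLDERS = {"N/A", "null", "None", "-"}


def clean_text(value):
    """Trim whitespace and map missing-value placeholders to an empty string."""
    if value is None:
        return ""
    text = str(value).strip()
    return "" if text in MISSING_PLACEHOLDERS else text


buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")  # default quoting is QUOTE_MINIMAL
writer.writerow(["quote", "note"])
writer.writerow([clean_text('  She said "hello" '), clean_text("N/A")])
print(buffer.getvalue())
```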
Numbers
- No currency symbols: `49.99` not `$49.99`
- No thousand separators: `1000000` not `1,000,000`
- Use period for decimals: `3.14`
- Empty for missing (not `0` unless zero is meaningful)
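A possible way to apply the number rules is sketched below. The helper name `clean_number` is hypothetical, and the code assumes US-style input (a `$` symbol, commas as thousand separators, a period as the decimal mark); other locales would need different handling.

```python
def clean_number(value):
    """Return a plain numeric string, or an empty string for missing input."""
    if value is None:
        return ""
    text = str(value).strip()
    if text == "":
        return ""
    # Assumption: "$" is the only currency symbol and "," only separates thousands.
    text = text.replace("$", "").replace(",", "")
    float(text)  # raises ValueError if the remaining text is not a number
    return text
```

A genuine zero is kept as `0`, while missing input stays empty, which preserves the distinction the last rule asks for.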
Dates
- ISO 8601 format: `2025-01-13`
- Include time if relevant: `2025-01-13T14:30:00`
- Use UTC when timezone matters
- Empty for missing dates (not "TBD" or "N/A")
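The date rules might be applied as in the sketch below. Both function names are hypothetical, and `to_iso_date` assumes the input uses the US `MM/DD/YYYY` layout unless another `input_format` is passed.

```python
from datetime import datetime, timedelta, timezone


def to_iso_date(value, input_format="%m/%d/%Y"):
    """Convert a date string to YYYY-MM-DD, or return '' for missing values."""
    text = (value or "").strip()
    if text in {"", "TBD", "N/A"}:
        return ""
    return datetime.strptime(text, input_format).date().isoformat()


def to_iso_utc(moment):
    """Convert a timezone-aware datetime to a UTC timestamp string."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


# 15:30 at UTC+1 is 14:30 in UTC.
local = datetime(2025, 1, 13, 15, 30, tzinfo=timezone(timedelta(hours=1)))
print(to_iso_date("01/13/2025"), to_iso_utc(local))
```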
URLs
- Full URL with protocol: `https://example.com`
- No trailing slashes unless meaningful
- URL-encode special characters if needed
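One way to apply the URL rules with the standard library is sketched here. The helper name `normalize_url` and its parameters are hypothetical; the code assumes `https` is the right protocol when none is given, and that a trailing slash is not meaningful unless the caller says so.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def normalize_url(url, default_scheme="https", keep_trailing_slash=False):
    """Add a protocol, percent-encode the path, and drop a trailing slash."""
    text = (url or "").strip()
    if not text:
        return ""
    if "://" not in text:
        # Assumption: URLs given without a protocol should use https.
        text = f"{default_scheme}://{text}"
    parts = urlsplit(text)
    # Keep "/" and existing "%xx" escapes; encode spaces and other unsafe characters.
    path = quote(parts.path, safe="/%")
    if not keep_trailing_slash:
        path = path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```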
Boolean Values
- Use lowercase: `true`, `false`
- Or: `yes`, `no`
- Pick one convention per file
- Empty for unknown (not "maybe")
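A sketch of a boolean normalizer that enforces a single convention per file follows. The helper name `normalize_bool`, the `style` parameter, and the sets of accepted input spellings are assumptions made for illustration.

```python
# Input spellings accepted as true or false (assumption: extend as needed).
TRUE_VALUES = {"true", "yes", "y", "1"}
FALSE_VALUES = {"false", "no", "n", "0"}


def normalize_bool(value, style="true_false"):
    """Map a raw value to 'true'/'false' or 'yes'/'no'; unknown becomes ''."""
    text = "" if value is None else str(value).strip().lower()
    if text in TRUE_VALUES:
        truthy = True
    elif text in FALSE_VALUES:
        truthy = False
    else:
        return ""  # unknown values such as "maybe" are left empty
    if style == "yes_no":
        return "yes" if truthy else "no"
    return "true" if truthy else "false"
```

Passing the same `style` for every cell in a file keeps the output to one convention.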
Lists Within Cells
- Pipe-delimited: `python|javascript|sql`
- Or semicolon: `python;javascript;sql`
- Never comma (conflicts with CSV delimiter)
- Document the convention in the first row or filename
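Packing and unpacking a list cell could look like the sketch below. The names `join_list` and `split_list` are hypothetical; the code rejects items that already contain the separator, since that would make the cell ambiguous.

```python
def join_list(items, separator="|"):
    """Join items into one cell, refusing items that contain the separator."""
    cleaned = [str(item).strip() for item in items]
    for item in cleaned:
        if separator in item:
            raise ValueError(f"item {item!r} contains the separator {separator!r}")
    return separator.join(cleaned)


def split_list(cell, separator="|"):
    """Split a cell back into items; an empty cell yields an empty list."""
    return cell.split(separator) if cell else []


cell = join_list(["python", "javascript", "sql"])
```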
Quality Checks
Before delivering a CSV:
- Open in a text editor: check for garbled characters or encoding issues
- Check row count: does it match the expected data?
- Verify headers: all lowercase, no spaces?
- Spot check values: dates formatted? numbers clean?
- No trailing empty rows: delete blank lines at the end
- Consistent columns: does every row have the same number of fields?
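Some of these checks can be automated. The sketch below covers the header, blank-row, and column-count checks only; the function name `validate_csv_text` and the header pattern are assumptions, and the row-count and spot checks still need a human or project-specific code.

```python
import csv
import io
import re

# Lowercase letters and digits separated by single underscores (assumed pattern).
HEADER_PATTERN = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def validate_csv_text(text):
    """Return a list of problems found in CSV text; an empty list means none."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    for name in header:
        if not HEADER_PATTERN.match(name):
            problems.append(f"bad header: {name!r}")
    # Records are numbered from 1, with the header as record 1.
    for number, row in enumerate(rows[1:], start=2):
        if not row:
            problems.append(f"record {number} is blank")
        elif len(row) != len(header):
            problems.append(
                f"record {number} has {len(row)} fields, expected {len(header)}"
            )
    return problems
```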
Common Mistakes to Avoid
| Wrong | Right |
|---|---|
| Company Name | company_name |
| $1,234.56 | 1234.56 |
| 01/13/2025 | 2025-01-13 |
| N/A | `` (empty) |
| TRUE | true |
| “hello” (curly quotes) | "hello" (straight quotes) |
Python Reference
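```python
import csv

# Example rows (hypothetical sample data); keys must match fieldnames
data = [{'company_name': 'Acme', 'website': 'https://example.com', 'price': '49.99'}]

# Writing: newline='' stops Python from translating line endings, and
# lineterminator='\n' produces LF endings (the csv module defaults to CRLF)
with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=['company_name', 'website', 'price'], lineterminator='\n')
    writer.writeheader()
    writer.writerows(data)

# Reading: newline='' lets the csv module handle newlines inside quoted fields
with open('input.csv', 'r', newline='', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    data = list(reader)
```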
Naming Convention
```
{topic}_{date}.csv

Examples:
competitor_pricing_2025-01-13.csv
job_listings_2025-01-13.csv
scraped_companies_2025-01-13.csv
```
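A filename following this pattern could be built as in the sketch below. The helper name `build_filename` is hypothetical, and it assumes the topic should be reduced to lowercase ASCII words joined by underscores, mirroring the header style.

```python
import re
from datetime import date


def build_filename(topic, on=None):
    """Return '{topic}_{date}.csv' with the date in ISO 8601 format."""
    day = on or date.today()
    # Assumption: the topic is slugged the same way as column headers.
    slug = re.sub(r"[^0-9a-z]+", "_", topic.lower()).strip("_")
    return f"{slug}_{day.isoformat()}.csv"


name = build_filename("Competitor Pricing", date(2025, 1, 13))
```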
Delivery Checklist
- UTF-8 encoded
- Headers in first row, lowercase with underscores
- No empty rows at end
- Dates in ISO format
- Numbers without formatting
- Empty strings for missing values
- Filename includes topic and date