- -bash/zsh: csvstat command not found
- # Windows (WSL2)
- sudo apt-get update && sudo apt-get install csvkit
- # Debian
- apt-get install csvkit
- # Ubuntu
- apt-get install csvkit
- # Kali Linux
- apt-get install csvkit
- # Fedora
- dnf install python3-csvkit
- # OS X
- brew install csvkit
- # Raspbian
- apt-get install python3-csvkit
- # Dockerfile
- dockerfile.run/csvstat
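Before picking one of the install commands above, it can help to confirm whether csvstat is already on the PATH. The following is a minimal sketch; the variable name `csvstat_status` is made up for illustration, and `-V` is the version flag listed in the usage synopsis.

```shell
# Check whether csvstat is already available before installing csvkit.
# csvstat_status is a hypothetical variable name used only in this sketch.
if command -v csvstat >/dev/null 2>&1; then
    csvstat_status="installed"
    # -V prints version information and exits.
    csvstat -V
else
    csvstat_status="missing"
    echo "csvstat not found; install csvkit with your package manager" >&2
fi
echo "csvstat is $csvstat_status"
```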
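The csvstat command prints descriptive statistics for every column in a CSV file. It infers the type of each column and then prints the analysis relevant to that type (the range for dates, the mean and median for integers, and so on):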
- csvstat [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
- [-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-L LOCALE]
- [-S] [--blanks] [--null-value NULL_VALUES [NULL_VALUES ...]]
- [--date-format DATE_FORMAT] [--datetime-format DATETIME_FORMAT]
- [-H] [-K SKIP_LINES] [-v] [-l] [--zero] [-V] [--csv] [--json]
- [-i INDENT] [-n] [-c COLUMNS] [--type] [--nulls] [--non-nulls]
- [--unique] [--min] [--max] [--sum] [--mean] [--median]
- [--stdev] [--len] [--max-precision] [--freq]
- [--freq-count FREQ_COUNT] [--count]
- [--decimal-format DECIMAL_FORMAT] [-G] [-y SNIFF_LIMIT] [-I]
- [FILE]
- FILE The CSV file to operate on.
- If omitted, will accept input as piped data via STDIN.
- -h, --help show this help message and exit
- --csv Output results as a CSV table, rather than plain text.
- --json Output results as JSON text, rather than plain text.
- -i INDENT, --indent INDENT
- Indent the output JSON this many spaces. Disabled by
- default.
- -n, --names Display column names and indices from the input CSV
- and exit.
- -c COLUMNS, --columns COLUMNS
- A comma-separated list of column indices, names or
- ranges to be examined, e.g. "1,id,3-5". Defaults to
- all columns.
- --type Only output data type.
- --nulls Only output whether columns contains nulls.
- --non-nulls Only output counts of non-null values.
- --unique Only output counts of unique values.
- --min Only output smallest values.
- --max Only output largest values.
- --sum Only output sums.
- --mean Only output means.
- --median Only output medians.
- --stdev Only output standard deviations.
- --len Only output the length of the longest values.
- --max-precision Only output the most decimal places.
- --freq Only output lists of frequent values.
- --freq-count FREQ_COUNT
- The maximum number of frequent values to display.
- --count Only output total row count.
- --decimal-format DECIMAL_FORMAT
- %-format specification for printing decimal numbers.
- Defaults to locale-specific formatting with "%.3f".
- -G, --no-grouping-separator
- Do not use grouping separators in decimal numbers.
- -y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT
- Limit CSV dialect sniffing to the specified number of
- bytes. Specify "0" to disable sniffing entirely, or
- "-1" to sniff the entire file.
- -I, --no-inference Disable type inference when parsing the input. Disable
- reformatting of values.
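The options above can be combined. As a rough sketch, the following builds a small sample file (the data and the variable name `sample_csv` are invented for illustration), pipes it to csvstat through STDIN with no FILE argument, and then requests indented JSON output. The csvstat calls are skipped when csvkit is not installed.

```shell
# Create a tiny sample CSV in a temporary file (made-up data for illustration).
sample_csv="$(mktemp "${TMPDIR:-/tmp}/csvstat-sample.XXXXXX")"
printf 'id,name,score\n1,alice,90.5\n2,bob,72\n3,alice,88\n' > "$sample_csv"

if command -v csvstat >/dev/null 2>&1; then
    # No FILE argument: csvstat reads the piped data from STDIN.
    # -c limits the report to the named columns.
    cat "$sample_csv" | csvstat -c name,score

    # Same file, reported as JSON indented by 2 spaces.
    csvstat --json -i 2 "$sample_csv"
else
    echo "csvstat not installed; skipping the csvstat calls" >&2
fi
```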
Show all statistics for all columns:
- csvstat data.csv
When the name of a statistic is passed to csvstat, only that statistic is printed:
- csvstat --min examples/realdata/FY09_EDU_Recipients_by_State.csv
- 1. State Name: None
- 2. State Abbreviate: None
- 3. Code: 1
- 4. Montgomery GI Bill-Active Duty: 435
- 5. Montgomery GI Bill- Selective Reserve: 48
- 6. Dependents' Educational Assistance: 118
- 7. Reserve Educational Assistance Program: 60
- 8. Post-Vietnam Era Veteran's Educational Assistance Program: 1
- 9. TOTAL: 768
- 10. j: None
If a single statistic and a single column are requested, csvstat returns only a value:
- csvstat -c 4 --mean examples/realdata/FY09_EDU_Recipients_by_State.csv
- 6,263.904
Show all statistics for columns 2 and 4:
- csvstat -c 2,4 data.csv
Show the sum of every column:
- csvstat --sum data.csv
Show the length of the longest value in column 3:
- csvstat -c 3 --len data.csv
Show the number of unique values in the name column:
- csvstat -c name --unique data.csv
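Because a single statistic on a single column is returned as a bare value, the result can be captured in a shell variable. This is a sketch under assumptions: the sample data and the names `sample_csv`, `max_score`, and `unique_names` are hypothetical, and `-G` is added so that numbers come back without grouping separators.

```shell
# Create a tiny sample CSV in a temporary file (made-up data for illustration).
sample_csv="$(mktemp "${TMPDIR:-/tmp}/csvstat-sample.XXXXXX")"
printf 'id,name,score\n1,alice,90\n2,bob,72\n3,alice,88\n' > "$sample_csv"

if command -v csvstat >/dev/null 2>&1; then
    # One column plus one statistic: csvstat prints just the value.
    # -G suppresses grouping separators, which makes the number easier to reuse.
    max_score="$(csvstat -c score --max -G "$sample_csv")"
    unique_names="$(csvstat -c name --unique "$sample_csv")"
    echo "max score: $max_score, distinct names: $unique_names"
else
    echo "csvstat not installed; skipping the csvstat calls" >&2
fi

# Clean up the temporary file.
rm -f "$sample_csv"
```

The exact number formatting depends on the locale and on `--decimal-format`, so scripts that compare the captured value should not assume a fixed number of decimal places.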