csvlook is part of the csvkit package.
-bash: csvlook: command not found
# Install via pip
sudo pip install csvkit
# Debian
apt-get install csvkit
# Ubuntu
apt-get install csvkit
# Kali Linux
apt-get install csvkit
# Fedora
dnf install python3-csvkit
# OS X
brew install csvkit
# Raspbian
apt-get install python3-csvkit
csvlook renders a CSV file on the command line as a Markdown-compatible, fixed-width table.
csvlook [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
[-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-L LOCALE]
[-S] [--blanks] [--date-format DATE_FORMAT]
[--datetime-format DATETIME_FORMAT] [-H] [-K SKIP_LINES] [-v]
[-l] [--zero] [-V] [--max-rows MAX_ROWS]
[--max-columns MAX_COLUMNS]
[--max-column-width MAX_COLUMN_WIDTH] [-y SNIFF_LIMIT] [-I]
[FILE]
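The csvkit tools share a set of common command-line arguments. Not every tool supports every argument, so run a tool with the --help flag to check which ones it accepts: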
-d DELIMITER, --delimiter DELIMITER
Delimiting character of the input CSV file.
-t, --tabs Specify that the input CSV file is delimited with
tabs. Overrides "-d".
-q QUOTECHAR, --quotechar QUOTECHAR
Character used to quote strings in the input CSV file.
-u {0,1,2,3}, --quoting {0,1,2,3}
Quoting style used in the input CSV file. 0 = Quote
Minimal, 1 = Quote All, 2 = Quote Non-numeric, 3 =
Quote None.
-b, --no-doublequote Whether or not double quotes are doubled in the input
CSV file.
-p ESCAPECHAR, --escapechar ESCAPECHAR
Character used to escape the delimiter if --quoting 3
("Quote None") is specified and to escape the
QUOTECHAR if --no-doublequote is specified.
-z FIELD_SIZE_LIMIT, --maxfieldsize FIELD_SIZE_LIMIT
Maximum length of a single field in the input CSV
file.
-e ENCODING, --encoding ENCODING
Specify the encoding of the input CSV file.
-L LOCALE, --locale LOCALE
Specify the locale (en_US) of any formatted numbers.
-S, --skipinitialspace
Ignore whitespace immediately following the delimiter.
--blanks Do not coerce empty, "na", "n/a", "none", "null", "."
strings to NULL values.
--date-format DATE_FORMAT
Specify a strptime date format string like "%m/%d/%Y".
--datetime-format DATETIME_FORMAT
Specify a strptime datetime format string like
"%m/%d/%Y %I:%M %p".
-H, --no-header-row Specify that the input CSV file has no header row.
Will create default headers (a,b,c,...).
-K SKIP_LINES, --skip-lines SKIP_LINES
Specify the number of initial lines to skip before the
header row (e.g. comments, copyright notices, empty
rows).
-v, --verbose Print detailed tracebacks when errors occur.
-l, --linenumbers Insert a column of line numbers at the front of the
output. Useful when piping to grep or as a simple
primary key.
--zero When interpreting or displaying column numbers, use
zero-based numbering instead of the default 1-based
numbering.
-V, --version Display version information and exit.
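As a rough illustration of how a few of these shared options combine, the sketch below builds a small tab-delimited file with no header row and renders it with csvlook. The temporary file and its contents are made up for the example; the flags (-t, -H, -l) are the ones listed above.

```shell
# Hypothetical sample data: tab-delimited, no header row.
sample=$(mktemp)
printf 'NE\tADAMS\t1\nNE\tBUFFALO\t8\n' > "$sample"

# Only call csvlook if csvkit is actually installed.
if command -v csvlook >/dev/null 2>&1; then
    # -t: input is tab-delimited
    # -H: no header row, so default headers (a,b,c,...) are created
    # -l: insert a column of line numbers at the front of the output
    csvlook -t -H -l "$sample"
else
    echo "csvlook not found; install csvkit first" >&2
fi
```

Because -t overrides -d, there is no need to pass a delimiter as well.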
FILE: the input CSV file.
Use csvlook to view the data in a CSV file:
csvlook data.csv
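The output of csvlook may not look clean: the table can appear as a jumble of pipe characters and dashes. This happens when the dataset has many columns and they cannot all fit across the terminal at once, so each row wraps. There are two ways to deal with this: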
# 1. Pipe the output to less -S, which displays each row without wrapping; use the arrow keys to scroll left and right:
csvlook data.csv | less -S
# 2. Reduce the number of columns with csvcut before viewing the dataset:
$ csvcut -n data.csv
  1: state
  2: county
  3: fips
  4: nsn
  5: item_name
  6: quantity
  7: ui
  8: acquisition_cost
  9: total_cost
 10: ship_date
 11: federal_supply_category
 12: federal_supply_category_name
 13: federal_supply_class
 14: federal_supply_class_name
# As shown above, the data has 14 columns. To keep only columns 2, 5 and 6:
$ csvcut -c 2,5,6 data.csv
# The CSV output now has only 3 columns. Columns can also be referenced by name:
$ csvcut -c county,item_name,quantity data.csv
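The two approaches can also be combined: csvcut trims the columns and csvlook renders what is left as a table. The sketch below assumes a small stand-in file whose column names mirror a few of those listed by csvcut -n; the rows are invented for illustration. The --max-columns and --max-column-width options are taken from the csvlook synopsis; check csvlook --help on your installed version for their exact behavior.

```shell
# Hypothetical stand-in for data.csv, using a few of the column names above.
data=$(mktemp)
cat > "$data" <<'EOF'
state,county,fips,item_name,quantity
NE,ADAMS,31001,"RIFLE,5.56 MILLIMETER",1
NE,BUFFALO,31019,"MINE RESISTANT VEHICLE",1
EOF

# Only call the csvkit tools if they are actually installed.
if command -v csvlook >/dev/null 2>&1; then
    # Select three columns by name, then render the result as a table.
    csvcut -c county,item_name,quantity "$data" | csvlook

    # Alternatively, ask csvlook itself to limit the columns and their width.
    csvlook --max-columns 3 --max-column-width 12 "$data"
else
    echo "csvkit not found; install it first" >&2
fi
```

In an interactive terminal, appending | less -S to either command still allows horizontal scrolling if the table remains wider than the screen.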