[UNIX CLI] Command Line(2)

3 minute read


UNIX COMMAND LINE

  • UNIX CLI 정리(2)
  • cut을 통해 텍스트 형식 파일의 열을 선택해서 볼 수 있음
  • grep을 통해 텍스트 형식 파일에서 특정 문자 패턴만 filter해서 볼 수 있음
  • >을 통해 명령어를 통해 산출된 결과를 저장(redirection)할 수 있음
  • |을 통해 | 왼쪽에 있는 명령어의 결과를 오른쪽에서 사용할 수 있음
  • wc을 통해 characters, words, lines의 수를 확인할 수 있음
  • wildcard를 통해 여러 file을 한 번에 지정할 수 있음
    • * - matches zero or more characters
    • ? - matches a single character
    • [...] - matches any one of the characters inside the square brackets
    • {...} - matches any of the comma-separated patterns inside the curly brackets
  • sort을 통해 데이터를 정렬할 수 있음
  • uniq을 통해 중복 데이터(인접 데이터 간만)를 distinct 시킬 수 있음

cut

  • -f (fields)를 통해 열을 지정
  • -d (delimiter)를 통해 separator 지정
# select the first 3 column from the file spring.csv
cut -f 1-3 -d , spring.csv

grep

  • -c : print a count of matching lines rather than the lines themselves
  • -h : do not print the names of files when searching multiple files
  • -i : ignore case (e.g., treat “Regression” and “regression” as matches)
  • -l : print the names of files that contain matches, not the matches
  • -n : print line numbers for matching lines
  • -v : invert the match, i.e., only show lines that don’t match
# Print the contents of all of the lines containing the word molar in seasonal/autumn.csv
grep molar seasonal/autumn.csv

# Print all the lines that don't contain the word molar in seasonal/autum.csv & show their line numbers
grep -v -n molar seasonal/autumn.csv

# Count how many lines contain the word incisor in autumn.csv and winter.csv combined
grep -c incisor seasonal/autumn.csv seasonal/winter.csv

>

  • >는 그 자체로 command의 option이 아님, 다만 command로 산출된 결과를 저장하도록 함
  • >는 모든 명령어를 입력한 후 마지막에 쓰는 것이 일반적이나 맨 앞에 쓰는 것도 가능, 중간에 사용하는 것은 안됨
# save iris head 20 file as iris2
head -20 iris.csv > iris2.csv

|

# select the 3rd-5th row in summer.csv
head -n 5 summer.csv | tail -n 3

# select the 2nd column of summer.csv where 'Tooth' is missing, and then select the first row
cut -d , -f 2 summer.csv | grep -v Tooth | head -n 1

wc

  • -c : number of characters
  • -w : number of words
  • -l : number of lines
# Count how many records in seasonal/spring.csv have dates in July 2017 (2017-07)
cut -f 1 -d , seasonal/spring.csv | grep 2017-07 | wc -l

specifing multiple files

  • 띄어쓰기를 통해 2개 이상의 파일을 명령어를 적용할 수 있음
  • * : matches zero or more characters
  • ? : matches a single character
    • 201?.txt matches 2017.txt, 2018.txt but not 2017-01.txt
  • [...] : matches any one of the characters inside the square brackets
    • 201[78].txt matches 2017.txt and 2017.txt but not 2016.txt
  • {...} : matches any of the comma-separated patterns inside the curly brackets
    • {*.txt, *.csv} matches files that endsw with .txt and .csv but not .png
# Get the first three lines from both spring.csv and summer.csv, but not autumn.csv and winter.csv
head -n 3 s*

sort

  • -n : sort numerically
  • -r : reverse the order of its output
  • -b : ignore leading blanks
  • -f : fold case (i.e., be case-insensitive)
# sort the 5th(species) column of iris in descending order
cut -f 5 -d , iris.csv | sort -n -r

uniq

  • -c : display unique lines with a count of how often each occurs
    • uniq은 인접한 경우에만 중복 데이터를 제거하기에 uniqe한 경우만 산출하고 싶은 경우 sort와 함께 사용하면 됨
# count the number of uniqe(distinct) species of iris
cut -f 5 -d , iris.csv | uniq -c
  • Get the second column from seasonal/winter.csv
  • Remove the word “Tooth” from the output so that only tooth names are displayed
  • Sort the output so that all occurrences of a particular tooth name are adjacent
  • Display each tooth name once along with a count of how often it occurs
cut -f 2 -d , seasonal/winter.csv | grep -v Tooth | sort -n | uniq -c