[UNIX CLI] Command Line(2)
          
  
    
    
    
      
      
      
        
        
        
      
    
  
        
      
      
        
          
        
        
UNIX COMMAND LINE
  - UNIX CLI summary (2)
 
  cut selects columns from a text file 
  grep filters a text file, showing only the lines that match a pattern 
  > saves (redirects) the output of a command to a file 
  | passes the output of the command on its left to the command on its right 
  wc counts characters, words, and lines 
  - wildcards specify several files at once
    
      * - matches zero or more characters 
      ? - matches a single character 
      [...] - matches any one of the characters inside the square brackets 
      {...} - matches any of the comma-separated patterns inside the curly brackets 
    
   
  sort orders data 
  uniq removes duplicate lines (only when they are adjacent) 
cut
  - -f (fields) : specify the columns to keep
 
  - -d (delimiter) : specify the separator
 
# select the first 3 columns from the file spring.csv
cut -f 1-3 -d , spring.csv
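
As a minimal sketch of how -f and -d work together, the snippet below runs cut on a small made-up file. The file name sample.csv and its contents are hypothetical, not part of the original dataset.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data (not from the original dataset)
printf 'Date,Tooth,Side\n2017-01-25,wisdom,left\n2017-02-19,canine,right\n' > sample.csv

# Keep columns 1 through 2, using a comma as the delimiter
first_two=$(cut -d , -f 1-2 sample.csv)
echo "$first_two"

# Keep only column 2
second=$(cut -d , -f 2 sample.csv)
echo "$second"
```

Without -d, cut assumes the columns are separated by tabs, so -d , is needed for CSV files.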
 
grep
  - -c : print a count of matching lines rather than the lines themselves
 
  - -h : do not print the names of files when searching multiple files
 
  - -i : ignore case (e.g., treat “Regression” and “regression” as matches)
 
  - -l : print the names of files that contain matches, not the matches
 
  - -n : print line numbers for matching lines
 
  - -v : invert the match, i.e., only show lines that don’t match
 
# Print the contents of all of the lines containing the word molar in seasonal/autumn.csv
grep molar seasonal/autumn.csv
# Print all the lines that don't contain the word molar in seasonal/autumn.csv & show their line numbers
grep -v -n molar seasonal/autumn.csv
# Count how many lines contain the word incisor in autumn.csv and winter.csv (with several files, -c prints one count per file)
grep -c incisor seasonal/autumn.csv seasonal/winter.csv
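
The flags above can be tried on two small files. This is only a sketch: the files autumn.csv and winter.csv created here hold made-up rows, not the real seasonal data.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data
printf 'Date,Tooth\n2017-01-11,molar\n2017-02-03,incisor\n2017-03-14,Molar\n' > autumn.csv
printf 'Date,Tooth\n2017-07-01,incisor\n2017-08-13,incisor\n' > winter.csv

# -i ignores case, -n prefixes each match with its line number
matches=$(grep -i -n molar autumn.csv)
echo "$matches"

# -c with several files prints one "name:count" line per file
per_file=$(grep -c incisor autumn.csv winter.csv)
echo "$per_file"

# For a single combined count, concatenate the files first
combined=$(cat autumn.csv winter.csv | grep -c incisor)
echo "$combined"

# -l lists only the names of files that contain a match
files=$(grep -l molar autumn.csv winter.csv)
echo "$files"
```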
 
>
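  > is not an option of the command itself; it tells the shell to save the command's output to a file 
  > is usually written at the end, after the whole command, but it also works at the very front; in the middle of a pipeline it diverts that command's output into the file, so the next command receives nothing 
# save the first 20 lines of iris.csv as iris2.csv
head -n 20 iris.csv > iris2.csv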
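
A small sketch of where > can be placed, assuming a made-up input file letters.txt (all file names here are hypothetical):

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical five-line input
printf 'a\nb\nc\nd\ne\n' > letters.txt

# Redirection at the end of the command (the usual form)
head -n 2 letters.txt > end.txt

# Redirection at the front: the shell treats it the same way
> front.txt head -n 2 letters.txt

# Both files hold the same two lines
cmp end.txt front.txt && echo "identical"

# Redirecting in the middle of a pipeline: head writes to mid.txt,
# so wc reads nothing from the pipe and reports 0 lines
downstream=$(head -n 2 letters.txt > mid.txt | wc -l | tr -d ' ')
echo "$downstream"
```

Note that > overwrites the target file if it already exists.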
 
|
# select the 3rd-5th lines of summer.csv
head -n 5 summer.csv | tail -n 3
# select the 2nd column of summer.csv, drop the lines containing 'Tooth' (the header), then keep only the first line
cut -d , -f 2 summer.csv | grep -v Tooth | head -n 1
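
To see what the pipe saves, the sketch below does the same head/tail selection twice: once through an intermediate file and once with |. The file numbers.txt is a made-up example.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical input: the numbers 1 to 10, one per line
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > numbers.txt

# Without a pipe: store the intermediate result in a file first
head -n 5 numbers.txt > top.txt
via_file=$(tail -n 3 top.txt)

# With a pipe: same result, no intermediate file needed
via_pipe=$(head -n 5 numbers.txt | tail -n 3)

echo "$via_pipe"
```

Both variables end up holding lines 3 to 5 of the input.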
 
wc
  - -c : number of characters (counted as bytes; -m counts multibyte characters)
 
  - -w : number of words
 
  - -l : number of lines
 
# Count how many records in seasonal/spring.csv have dates in July 2017 (2017-07)
cut -f 1 -d , seasonal/spring.csv | grep 2017-07 | wc -l
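
The three counters can be checked on tiny inputs. This is a sketch with made-up data; the spring.csv created here is hypothetical. Some wc implementations pad the number with spaces, so tr -d ' ' strips them.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data
printf 'Date,Tooth\n2017-07-10,molar\n2017-07-25,canine\n2017-08-02,molar\n' > spring.csv

# Count the records dated July 2017
july=$(cut -d , -f 1 spring.csv | grep 2017-07 | wc -l | tr -d ' ')
echo "$july"

# -w counts whitespace-separated words
words=$(printf 'one two three\n' | wc -w | tr -d ' ')
echo "$words"

# -c counts bytes, including the trailing newline
bytes=$(printf 'abc\n' | wc -c | tr -d ' ')
echo "$bytes"
```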
 
specifying multiple files
  - a command can be applied to two or more files by separating the file names with spaces
 
  * : matches zero or more characters 
  ? : matches a single character
    
      201?.txt matches 2017.txt, 2018.txt but not 2017-01.txt 
    
   
  [...] : matches any one of the characters inside the square brackets
    
      201[78].txt matches 2017.txt and 2018.txt but not 2016.txt 
    
   
  {...} : matches any of the comma-separated patterns inside the curly brackets
    
      - {*.txt,*.csv} (no space after the comma) matches files that end with .txt or .csv but not .png 
    
   
# Get the first three lines from both spring.csv and summer.csv, but not autumn.csv and winter.csv
head -n 3 s*
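
The patterns above can be tried safely with echo, which just prints the file names the shell expands. The sketch uses empty, made-up files; brace expansion is a feature of shells such as bash and zsh, so the last line assumes one of those.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"
export LC_ALL=C   # fixed sort order for the expanded names

# Hypothetical empty files to match against
touch 2016.txt 2017.txt 2018.txt 2017-01.txt notes.csv plot.png

# ? matches exactly one character, so 2017-01.txt is excluded
single=$(echo 201?.txt)
echo "$single"

# [78] matches either 7 or 8
bracket=$(echo 201[78].txt)
echo "$bracket"

# * matches zero or more characters
star=$(echo 2017*)
echo "$star"

# {...} expands to each comma-separated pattern in turn (bash/zsh)
echo {*.csv,*.png}
```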
 
sort
  - -n : sort numerically
 
  - -r : reverse the order of its output
 
  - -b : ignore leading blanks
 
  - -f : fold case (i.e., be case-insensitive)
 
# sort the 5th (species) column of iris in descending order
cut -f 5 -d , iris.csv | sort -r
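
The difference between the default text ordering and -n shows up as soon as the numbers have different lengths. A minimal sketch with a made-up file nums.txt:

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"
export LC_ALL=C   # fixed, byte-wise ordering for text comparisons

# Hypothetical input
printf '10\n9\n100\n' > nums.txt

# Default: compared as text, so "100" comes before "9"
text_order=$(sort nums.txt)
echo "$text_order"

# -n: compared as numbers
num_order=$(sort -n nums.txt)
echo "$num_order"

# -n -r: numeric, largest first
desc=$(sort -n -r nums.txt)
echo "$desc"
```

For text columns such as species names, -n is not needed; plain sort or sort -r is enough.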
 
uniq
  - -c : display unique lines with a count of how often each occurs
    
      - uniq removes duplicates only when they are adjacent, so to get truly unique values, use it together with sort
 
    
   
# count how often each unique (distinct) species of iris occurs
cut -f 5 -d , iris.csv | sort | uniq -c
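
The adjacency rule is easiest to see on unsorted input. The sketch below uses a made-up file species.txt; the sed call only strips the leading padding that uniq -c puts before each count.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical unsorted input: "setosa" appears before and after "virginica"
printf 'setosa\nvirginica\nsetosa\nsetosa\n' > species.txt

# uniq alone only collapses adjacent repeats, so "setosa" shows up twice
adjacent_only=$(uniq species.txt)
echo "$adjacent_only"

# Sorting first makes all duplicates adjacent
distinct=$(sort species.txt | uniq)
echo "$distinct"

# -c adds a count in front of each distinct line
counts=$(sort species.txt | uniq -c | sed 's/^ *//')
echo "$counts"

# Number of distinct values: count the lines that uniq prints
n_distinct=$(sort species.txt | uniq | wc -l | tr -d ' ')
echo "$n_distinct"
```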
 
  - Get the second column from seasonal/winter.csv
 
  - Remove the word “Tooth” from the output so that only tooth names are displayed
 
  - Sort the output so that all occurrences of a particular tooth name are adjacent
 
  - Display each tooth name once along with a count of how often it occurs
 
cut -f 2 -d , seasonal/winter.csv | grep -v Tooth | sort | uniq -c
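
The four steps can also be run one at a time to inspect each intermediate result. This is a sketch on made-up rows; the winter.csv created here is hypothetical, not the real seasonal/winter.csv.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data
printf 'Date,Tooth\n2017-01-03,bicuspid\n2017-01-05,canine\n2017-01-21,bicuspid\n2017-02-05,canine\n2017-02-17,canine\n' > winter.csv

# Step 1: get the second column
step1=$(cut -d , -f 2 winter.csv)

# Step 2: remove the header line "Tooth"
step2=$(printf '%s\n' "$step1" | grep -v Tooth)

# Step 3: sort so that equal names are adjacent
step3=$(printf '%s\n' "$step2" | sort)

# Step 4: collapse duplicates and count them (padding stripped with sed)
step4=$(printf '%s\n' "$step3" | uniq -c | sed 's/^ *//')
echo "$step4"
```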