Jul 09, 2017

Day-to-day: working at the command line

When you use the command line to navigate a multitude of tasks, experience won't quote the sources it was gained from: probably manual pages, project documentation, questions answered by the likes of Stack Overflow, and walkthroughs in tutorials. Once you know some basic commands, ways to combine them and have the muscle memory for their parameters, a lot of tasks and questions can be resolved quickly and interactively. That flexibility is the reward for the learning curve.

All this happens in a shell, and there is valid criticism of inconsistencies in its usage, as well as projects that try to take a fresh approach. In an interview with the creator of PowerShell you can find a bit of discussion on this around minute 20. Notable new alternative shells are Oil and fish. I haven't formed an opinion on this - plain bash hasn't bitten me yet - but I sometimes glimpse the history involved.

Mustering the persistence to work through a book or even a guide without a concrete itch to scratch is hard (for me). I learn best while solving an immediate problem. My goal in writing this overview is to paint a mental picture for the uninitiated, so one can attribute a command to its category and handle common tasks: "How do I.. move efficiently in the terminal? Search for and within files? Edit them? Get to know the system? Talk to the network? Retrieve and work with structured data? Use archives and create backups? Mind security and have a versatile, easy-to-set-up workspace? Keep track of text-based changes?"

at the prompt

Use ctrl + cursor keys ← → to move between word boundaries. ctrl+w deletes the word before the cursor, ctrl+a hops to the start of the line, ctrl+e to the end. ctrl+r searches backward through formerly entered commands (in ~/.bash_history) - I'm a heavy user of this. You may need a fix to cycle forward again with ctrl+s. ctrl+l clears the visible buffer, ctrl+u clears the line (no need for ctrl+c on an empty line). Shift + PageUp/PageDown scrolls in the terminal. Use the autocomplete feature with TAB and hop back to the former directory with cd -. Forgot to sudo? Run sudo !! and bash will pull the last command from history and run it with sudo. There are a lot more built-ins and you can even switch to vi-mode bindings.
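The ctrl+s fix and the vi-mode switch mentioned above are two lines of shell config. A ~/.bashrc sketch, assuming bash with readline:

```shell
# ~/.bashrc additions (sketch)
stty -ixon      # reclaim ctrl+s from XON/XOFF flow control, so it cycles search results forward
# set -o vi     # uncomment to swap the default emacs-style bindings for vi-mode
```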

search for and within files

There is grep -rni <searchterm>, and filtering by filename with find . -name '*.py' -print0 | xargs -0 grep <term>. Some modern grep replacements (ag, ripgrep) ignore VCS directories by default. You can use inverted grep -v to filter out undesired lines. As always, sort -n, uniq -c and wc -l will answer statistical questions, whether on access logs or expected results. With grep -E or -P, regexes can be used; the -o flag is useful for value extraction. I often prepare complex ones at regex101. find can combine criteria and execute commands on its results: find /data/movies/ \( -name '*.mp4' -or -name '*.mkv' \) -and -mtime +30 -and -size +1G -delete, or instead -exec rm -f {} \; (note the escaped parentheses - -and binds tighter than -or). Adding -ls is also useful.
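The counting idiom can be tried on a self-contained toy corpus (file names and the "error" term are made up for illustration):

```shell
# build a tiny corpus in a temp dir, then ask: which file mentions "error" most often?
tmp=$(mktemp -d)
printf 'error: disk full\nok\nerror: timeout\n' > "$tmp/a.log"
printf 'ok\nerror: refused\n' > "$tmp/b.log"
# -r recurse, -c count matching lines per file; sort on the count column, busiest first
counts=$(grep -rc 'error' "$tmp" | sort -t: -k2 -rn)
echo "$counts"
rm -r "$tmp"
```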

exiting VIM without a bitter taste

:q! is the no-questions-asked exit. With i or the INSERT key you enter the thus-named insert mode, :w lets you write your changes. u is for undo and ctrl+r for redoing changes. ESC will leave whatever mode you are in. Individual keys set the type - and in combination with following keys - the context of the manipulation. yyp copies ("yank" in vimspeak) and pastes a line, dd will delete it. x deletes a character and dw a word; D and C delete to the end of the line, with C entering insert mode after the deletion. Pressing ci" deletes within two quote characters and enters insert mode - very convenient. ca" will rid you of the quotes too. Visual mode v combined with y or x copies/cuts a selected region, to be pasted with p again. As in less, / is for search. :set ignorecase for case-insensitive search. Pressing # searches backward for the word under the cursor (* searches forward), n is for the next hit. Option :set hlsearch will highlight the matches, :syntax on is for general syntax highlighting. Getting or removing line numbering is :set nu or nonu - shortform for :set (no)number; jumping to a line number is :42. Directives for tab/spacing have similar shortcuts. :set expandtab will insert spaces instead of a tab character as you type. :set tabstop=4 will introduce sane spacing, as does its shortform :set ts=4. Having opened a file without write permission, :w !sudo tee % will save you time. Using VIM with an interpreter? Bind a shortcut to run the file with autocmd. Search & replace works with :%s/old/new/g. The % of the "substitute" command makes it apply to every line; with :s/ it is just the current one. Reducing it to a visually selected area is also possible. The last section holds the flags: /g replaces all finds on a line instead of only the first, /i makes the match case-insensitive. To show lines matching a term or pattern, use :g/<term>/p or its inverse :g!/<term>/p. When editing around html tags, (capture groups) transfer the match to the result: %s/<a href="\([^"]*\)">/<a href="\1.html">/g.
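The :set options above, collected into a minimal ~/.vimrc sketch:

```vim
" minimal ~/.vimrc collecting the options from above
syntax on           " general syntax highlighting
set number          " show line numbers
set ignorecase      " case-insensitive search ...
set hlsearch        " ... with highlighted matches
set expandtab       " insert spaces instead of a tab character
set tabstop=4       " a tab counts as 4 columns
```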
If you like the concept: VIM is a "modal editor" and there are video tutorials that hint at the ways you can come to appreciate that. Local user groups exist to advance your knowledge of it, as does the "evil mode" for Emacs to reconcile adjacent editor camps.

system interaction

Logging into a struggling host, I check uptime, as it shows /proc/loadavg with fewer keystrokes. nproc will relate the load averages to the available processors. top or htop may hint at the resource-hogging processes. Seeing how long a process has run is also attainable via selected attributes: ps -eo pid,etime,cmd. After disk free (df -h and df -i), du or ncdu will show disk usage by directory/file, summarized by du -csmh. ncdu can write its findings into a database file if further fast querying (with jq) is necessary. Attaching to a process with strace -p <pid> will give you more insight into what the process is up to; outputting into a logfile helps too. gdb can create backtraces for segfaulting processes. lsof can list open files and ports, and ps has a multitude of options that can be used to see process state. pgrep in conjunction with pkill helps when looking for the right process (or process group) to terminate or send other POSIX signals to (yes, this is a thing). Start with selecting only the values you want to see, e.g. ps -eo pid,etime,lstart,cmd, to get a feel for what's available.
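The attribute-selection idea can be tried safely on one known process - the current shell ($$) - instead of scanning the whole table:

```shell
# pid, elapsed time and command line of the current shell only;
# the trailing '=' after each column name suppresses the header line
ps -o pid=,etime=,args= -p $$
```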

The blog of Julia Evans is a great resource for system debugging / curiosity. Generally, the knowledge bases of VPS providers (1,2) give you a quick overview of what is needed for a new system service before you have to dive into the project documentation.

The day I started using / to search man pages for "-<option flag> " (with a trailing space), they got way more accessible. I've come to appreciate bespoke EXAMPLES sections; if these are missing, check out cheat and eg. In the "systems we love" 2016 conference recording you can find a talk on man.

basic networking and curl

dig -t a <host> with +noall +answer in your ~/.digrc gives you a one-liner of TTL and IP. You can use dig -t ANY <host> @<nameserver> to get an overview; this works with any resource record. dig -x <ip> will give you the PTR record / hostname, and whois <ip> the owner of the netblock. -t soa <domain> I often need to determine the authoritative DNS server. mtr reveals the network hops involved, packet loss and latencies. ip addr, route -n, arp -n and netstat -ntap give insight into routing and ports. iptables -L is for seeing filtered ports; ss and nftables are upcoming contenders. Most of the time I need to see HTTP response headers with curl -sI <host>; adding headers with -H 'Host: subdomain.example.com', -H 'Accept-Encoding: gzip' or credentials with -u 'user:pass' is handy. Firefox will copy a complete curl command line from its network tab to your shell to continue debugging (Chrome is very useful at chrome://net-internals, too). curl combined with a cronjob can output continuous measurements into a file to be plotted by your favourite package.
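The cronjob idea could look like this sketch (example.com, the interval and the log path are placeholders; note that % is special in crontab lines and must be escaped with a backslash):

```shell
# crontab -e entry: every 5 minutes, append the total request time in seconds
*/5 * * * * curl -so /dev/null -w '\%{time_total}\n' https://example.com >> "$HOME/curl-timing.log"
```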

coming to terms with the mysql shell

.. is done by setting a custom pager and reading in the password for the mysql user (silently). Respawning the shell after you (accidentally) drop out is quick: read -s pw and then mysql -u <dbuser> -p$pw <dbname>. This works too when manipulating the line to go from mysql to mysqldump or mytop - you only have to enter the password once. Does the password variable have security repercussions? None I'm aware of - ps will not expand the variable and show the password to other logged-in users. Another shortcut is reverse-searching for the password-read in your bash history with ctrl+r when you log in. If you use the full path to a secrets/config file you could craft a pipe into the variable. Once at the prompt, use pager less -SniFX to keep a bigger-than-window query result from auto-scrolling and overwhelming you; it gives you the search capability of less too. Put the pager definition in a (chmod 600!) ~/.my.cnf file below the [mysql] section and user+password below [client]. Another bummer is if you happen to end up on a host with a mysql client (--version) compiled against libedit instead of readline: ctrl+w will then delete the whole line before the cursor instead of single words. Put bind "^W" ed-delete-prev-word into ~/.editrc and you'll be relieved.
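A ~/.my.cnf sketch with the placement described above (dbuser/s3cret are placeholders; keep the file chmod 600):

```ini
[client]
user     = dbuser
password = s3cret

[mysql]
pager = "less -SniFX"
```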

There are different techniques to get csv-formatted data ad-hoc into and out of a mysql database. Something I use often is the batch mode of mysql to process the data on the cli with sed and pipes: mysql -B -N -e "select id,user,email from users;" <dbname> | sed 's/\t/;/g' > users.csv. Another is LOAD DATA INFILE .. from the prompt. Reading up on when you'd prefer the options that mysqldump --opt or --skip-opt provide is useful: --opt is for backup and swift restore, --skip-extended-insert is handy for editing the INSERTs and re-applying them. I mostly dump to a gzip pipe, giving the file a date in a subshell: mysqldump <dbname> | gzip > $(date --iso)-dbname.sql.gz. Reapplying in a oneliner is gunzip < dbname.sql.gz | mysql <dbname>.
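The tab-to-semicolon step can be tried without a database by faking the mysql -B -N output. tr is used here as the portable single-character variant (\t in sed requires GNU sed); the rows are made-up sample data:

```shell
# stand-in for `mysql -B -N -e "..."` output: tab-separated rows
rows=$(printf '1\talice\talice@example.com\n2\tbob\tbob@example.com')
# turn the tab delimiter into a semicolon, like the sed call in the text
csv=$(printf '%s\n' "$rows" | tr '\t' ';')
echo "$csv"
```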

I haven't got much experience with Postgres, but when I had a use-case that involved filtering keys, its JSON functions, even compared with MySQL 8, could accomplish what I had to shell out to jq for in MySQL. pgloader then made transferring the dataset care-free.

I consider CSV, XML and JSON to be very widespread. The q util lets you run ordinary SQL on a CSV file, mostly needed with the q -H -t <file> flags. Going from a .csv to a sqlite DB is quick too and offers further benefits when used in a sql notebook. With xmlstarlet and xmlstarlet el -u <file> you can assess the basic structure of an XML file, count elements or filter parent nodes that miss a specific property. jq and jmespath help with JSON filtering. For a very basic prettyprint, piping to python -m json.tool or json_pp is available on most systems. Recently pup came around to filter by css-selector, if you ever need to extract from a DOM tree and don't want to reach for dedicated scrapers.
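The stdlib prettyprint route is testable without jq installed (assuming python3 is on the host; the JSON one-liner is made up):

```shell
# pretty-print a JSON one-liner with python's bundled json.tool
pretty=$(echo '{"user":"alice","id":1}' | python3 -m json.tool)
echo "$pretty"
```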

text processing

sed 's/term/replacement/g' replaces on a stream, and with the -i option in place on files. awk is its own language, though I use it mostly for looking at logs, awk '{print $7}' (the path column in httpd accesslogs), and doing basic math on columns. Find more examples at cli text processing. tr can translate (meaning replace or delete) a single character or a group of characters. I use it as a simpler sed when replacing newlines with spaces tr '\n' ' ', deleting them tr -d '\n' or changing the delimiter in a csv: tr ';' ','.
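The $7 access-log idiom on three fabricated log lines, combined with the counting pipeline from earlier:

```shell
# three fake httpd access-log lines; field 7 is the request path
log='1.2.3.4 - - [09/Jul/2017:10:00:00 +0200] "GET /index.html HTTP/1.1" 200 99
1.2.3.4 - - [09/Jul/2017:10:00:01 +0200] "GET /about.html HTTP/1.1" 200 99
5.6.7.8 - - [09/Jul/2017:10:00:02 +0200] "GET /index.html HTTP/1.1" 200 99'
# extract the path column, then count duplicates, most requested first
top=$(echo "$log" | awk '{print $7}' | sort | uniq -c | sort -rn)
echo "$top"
```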

tarballing, archives and backups

It's tar cfz ball.tar.gz inputdir/ for creation, tar xfz ball.tar.gz for extraction. zip -r file.zip inputdir/ has a similar usage pattern. Showing archive contents happens with tar tvf file.tar. The tar command knows --exclude=dir/ patterns (as does rsync, btw). Extracting only subdirectories is possible too. Sometimes you want to take the input from a file, then it's tar cf media.tar -T files.txt. There's the popular "tarpipe" to move the listed files to a new directory: tar cf - -T files.txt | (cd <destdir> && tar xvf -).
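A create/list/extract round trip, kept inside a temporary directory so it's safe to run anywhere:

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/in" "$tmp/out"
echo 'hello' > "$tmp/in/a.txt"
tar czf "$tmp/ball.tar.gz" -C "$tmp" in/   # -C changes directory before archiving
listing=$(tar tzf "$tmp/ball.tar.gz")      # t lists contents without extracting
tar xzf "$tmp/ball.tar.gz" -C "$tmp/out"
extracted=$(cat "$tmp/out/in/a.txt")
echo "$listing"
rm -r "$tmp"
```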

rsync is great for recursive transfers. Check what the -a of its "most popular" preserving -avz parameters expands to on explainshell.com and examine what those options do. Do you want to ignore the permissions or the creation/modification times and just check for size, or even only the checksum? There exist cases for other combinations than the plain -a.

If you don't use any form of backup yet (remote git repositories somehow count), rsnapshot is a great and proven way to do simple incremental backups utilizing rsync and hardlinks - very much recommended as an entry point into the world of backup.

secure practices

Use your ~/.ssh/config and you'll have a comfortable login life. Create a password-protected public/private keypair with ssh-keygen -t ed25519 -f ~/.ssh/username -C 'comment' and add keys to your ssh-agent at login time. For ad-hoc logins, I seldom have to go beyond ssh -i ~/.ssh/identityfile -l username <host>. Remember to set generics at the bottom of the ssh_config file: IdentitiesOnly yes, the path to IdentityFile ~/.ssh/file or User per domain. You can use arbitrary TLDs in assigning settings, and openssh has conf.d/-style include support in recent versions too. SSH has escape sequences to adjust the session, can proxy connections, and with sshuttle can help you tunnel traffic per ip prefix. Check out this article for advanced usage: "SSH can do that?" (2011).
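A ~/.ssh/config sketch with one host entry and generic defaults at the bottom (host name, user and key file are placeholders):

```
Host web
    HostName web.example.com
    User deploy
    IdentityFile ~/.ssh/deploy_ed25519
    IdentitiesOnly yes

Host *
    ServerAliveInterval 60
```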

With gnupg you can generate a keypair, export your pubkey, and import and sign other people's public keys. Good for secret sharing once you understand the proper signing process and the implied web-of-trust. Remember to generate a revocation certificate. Criticism of its complexity and lacking forward secrecy is valid, but it is still a powerful concept. pass in combination with pwgen -s <length> and xclip is great for never using a password twice; it uses a configurable gpg key. You can step it up once more and use gpg-agent as your ssh-agent - combined with a hardware dongle (1), your opsec level will be raised considerably. And remember, data at rest has to be encrypted, like your disk.

For a manual SSL certificate workflow, I came to create a CSR template and invoke the key generation with openssl req -new -sha256 -nodes -out domain.csr -newkey rsa:2048 -keyout domain.key -config <( cat csr_details.txt ). Like gpg, openssl too can encrypt and decrypt files ad hoc. I also use it to see the ciphers supported by an SSL host, inspect its certificate's expiration date and other properties.
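Certificate inspection can be tried offline with a throwaway self-signed certificate (the CN and file names are made up for the sketch):

```shell
tmp=$(mktemp -d)
# generate a key and self-signed cert in one go, no prompts
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -subj '/CN=example.test' \
  -keyout "$tmp/key.pem" -out "$tmp/cert.pem" 2>/dev/null
# read the expiration date back out of the certificate
enddate=$(openssl x509 -in "$tmp/cert.pem" -noout -enddate)
echo "$enddate"
rm -r "$tmp"
```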

terminal multiplexers

With screen, remember ctrl+a and ESC to switch between modes, the latter with PgUp/PgDown to scroll inside the window. To get around, I mostly need ctrl+a and d to detach from a session and screen -r for reattaching. Having multiple sessions? screen -ls lists them. When inside screen and in command mode via ctrl+a, S/| splits the screen horizontally/vertically, TAB switches between the regions, c creates a new subshell and " lets you choose an existing subshell from a list. n is for the next window. I think with the mode key, c, n and detaching you'll start to like the tool. tmux is more modern and the prefix is ctrl+b instead of +a; " is for vertical splits and % for horizontal ones. Detaching is the same key, though reattaching is tmux attach-session. Cycling through panes is +o, +space changes layouts, +c likewise creates new windows that can be split as you wish. It is not installed on every host you log into.
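If the screen muscle memory is strong, a ~/.tmux.conf sketch can move tmux's prefix over to ctrl+a:

```
# screen-style prefix for existing muscle memory
set -g prefix C-a
unbind C-b
bind C-a send-prefix
```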

settling in

I have an easy reinstall workflow: dotfiles assembled by topic in subdirectories are transferred to the new machine, and stow -R applies all or some of them. It's not a default package though. As with dotfiles, aliases and functions in ~/.bash_aliases and assembled notes (plain textfiles per topic) are testimony to past problems solved, and easily searchable. The package installation history of an apt-based distribution can give you a list of the packages you installed. Pipe zgrep -h -E '(apt|apt\-get) install' /var/log/apt/history.log* | cut -d' ' -f4- | sed -r 's/^(-f|-y) //g' | tr '\n' ' ' into a post-install.sh script and weed out the nonsense. So with PS1='\w \$ ' set, two keyboard shortcuts, some essential packages and authentication for external services, I'm quickly comfortable in a new environment.

version control

Git's UI has a commemorating xkcd comic strip. Some people recommend learning its underlying principles (Merkle trees) to ace the commands. checkout -- <path>, reset --hard <hash> and commit --amend are the retreats I use, besides the reflog, when I have lost orientation. With time, merge, rebase, cherry-pick - and creating lots of branches - become comfortable. diff can compare between tags and hashes, and its output also applies as a valid patch, besides format-patch; normal patch can work with them too. For side-by-side diffs (à la diff -y, ignoring whitespace with -w) via git, I chose the second option in this guide (sdiff+colordiff). Interactive add -i is very useful when you want to split changes into different commits. The more people are involved and touch the same files, the more interesting merge conflict resolution gets - configure a good 3-way-merge editor. filter-branch helps getting rid of accidental big-file commits; specialized tools exist for that, too. As with bash aliases, you can define aliases in your ~/.gitconfig. I use them to see when a file or string was first introduced, or what files a specific commit touched, and for a nicer oneline history (hist = log --pretty=format:'%h %ad | %s%d [%an]' --graph --date=short).
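The alias idea as a ~/.gitconfig sketch - hist is from the text, the other two are hypothetical helpers for the lookups mentioned:

```ini
[alias]
    hist = log --pretty=format:'%h %ad | %s%d [%an]' --graph --date=short
    # when was a string first introduced? usage: git introduced '<string>'
    introduced = log -S
    # which files did a commit touch? usage: git touched <hash>
    touched = diff-tree --no-commit-id --name-only -r
```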

  1. this goes back to ed, "the standard editor", and predates how grep was born. Watch this interview and enter Ex mode in vim with Q to do some time travelling