Exercise
- Take this short interactive regex tutorial.
 
The tutorial is useful at any level. To reinforce the material, we can work through the following problems after finishing it.
- Find the number of words (in `/usr/share/dict/words`) that contain at
least three `a`s and don’t have a `'s` ending. What are the three
most common last two letters of those words? `sed`’s `y` command, or
the `tr` program, may help you with case insensitivity. How many
of those two-letter combinations are there? And for a challenge:
which combinations do not occur?
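# keep words with at least three a's (any case), then drop only those ending in 's
cat /usr/share/dict/words | tr '[:upper:]' '[:lower:]' | \
grep -E "(a.*){3}" | grep -v "'s$" | \
grep -o -e '.\{2\}$' | \
sort | uniq -c | sort -nr | head -n3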
Challenge (ugly but works…)
# show which combinations are not included
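One way to approach the challenge, as a minimal sketch rather than the solution originally used here: collect the two-letter endings that do occur, then print every combination from `aa` to `zz` that is not among them. The function name `missing_endings` is hypothetical, and the sketch assumes the endings arrive lowercased, one per line, on standard input.

```shell
# missing_endings (hypothetical helper): read observed two-letter endings,
# one per line, and print every combination aa..zz that was never seen.
missing_endings() {
    awk '
        { seen[$0] = 1 }                      # remember each observed ending
        END {
            for (i = 97; i <= 122; i++)       # 97..122 are the codes for a..z
                for (j = 97; j <= 122; j++) {
                    combo = sprintf("%c%c", i, j)
                    if (!(combo in seen)) print combo
                }
        }'
}

# Possible usage (not run here):
#   tr '[:upper:]' '[:lower:]' < /usr/share/dict/words \
#     | grep -E "(a.*){3}" | grep -v "'s$" | grep -o -e '.\{2\}$' \
#     | missing_endings
```

Piping the result through `wc -l` gives the number of combinations that never occur; subtracting it from 676 gives the number that do.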
- To do in-place substitution it is quite tempting to do something like
`sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`. However this is a
bad idea, why? Is this particular to `sed`? Use `man sed` to find out
how to accomplish this.
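It is a bad idea because the shell opens `input.txt` for writing, and truncates it, before `sed` starts reading. `sed` then sees an empty file and the original contents are lost. This is not particular to `sed`: any command whose output is redirected to its own input file behaves the same way. `man sed` lists the option for editing in place:
-i[SUFFIX], --in-place[=SUFFIX]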
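The difference between the two approaches can be seen on a scratch file. This is a small sketch; the file names and the sample text are made up for illustration, and everything happens in a temporary directory.

```shell
# Work in a scratch directory so no real file is touched.
dir=$(mktemp -d)
printf 'hello world\n' > "$dir/broken.txt"
printf 'hello world\n' > "$dir/fixed.txt"

# The tempting version: the shell truncates broken.txt before sed reads it,
# so sed sees an empty file and the contents are lost.
sed 's/world/there/' "$dir/broken.txt" > "$dir/broken.txt"

# The in-place version: sed edits the file itself, and the .bak suffix
# keeps a copy of the old contents next to it.
sed -i.bak 's/world/there/' "$dir/fixed.txt"

wc -c < "$dir/broken.txt"    # size in bytes of the clobbered file
cat "$dir/fixed.txt"         # contents after the substitution
cat "$dir/fixed.txt.bak"     # backup of the original contents
```

Writing the suffix directly after `-i` (as in `-i.bak`) is the form accepted by both GNU and BSD `sed`.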
- Find your average, median, and max system boot time over the last ten
boots. Use `journalctl` on Linux and `log show` on macOS, and look
for log timestamps near the beginning and end of each boot. On Linux,
they may look something like:
Logs begin at ...
and
systemd[577]: Startup finished in ...
On macOS, look for:
=== system boot:
and
Previous shutdown cause: 5
# extract the total after "=", turn "1min 51.581s" into "1*60+51.581", and let bc evaluate it to seconds
journalctl | grep "\[1\]: Startup finished in" | sed -E 's/.*Startup finished in.* = (.*)\.$/\1/' | sed -E 's/([0-9]+)min /\1*60+/; s/s$//' | bc
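Once the boot times are available as plain numbers, one per line, the three statistics the exercise asks for can be computed with `sort` and `awk`. The helper name `boot_stats` is hypothetical, and the sketch assumes every input line is a single number of seconds.

```shell
# boot_stats (hypothetical helper): read one number per line on standard
# input and print the average, median, and maximum.
boot_stats() {
    LC_ALL=C sort -n | awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit
            # the input is sorted ascending, so the middle value(s) give the
            # median and the last value is the maximum
            median = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "avg=%.3f median=%.3f max=%.3f\n", sum / NR, median, v[NR]
        }'
}
```

To restrict the result to the last ten boots, pipe the extracted times through `tail -n 10` before `boot_stats`, assuming one time was extracted per boot.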
- Look for boot messages that are not shared between your past three
reboots (see `journalctl`’s `-b` flag). Break this task down into
multiple steps. First, find a way to get just the logs from the past
three boots. There may be an applicable flag on the tool you use to
extract the boot logs, or you can use `sed '0,/STRING/d'` to remove
all lines previous to one that matches `STRING`. Next, remove any
parts of the line that always vary (like the timestamp). Then,
de-duplicate the input lines and keep a count of each one (`uniq` is
your friend). And finally, eliminate any line whose count is 3 (since
it was shared among all the boots).
- Find an online data set like this one, this one, or maybe one from
here. Fetch it using `curl` and extract out just two columns of numerical
data. If you’re fetching HTML data, `pup` might be helpful. For JSON
data, try `jq`. Find the min and
max of one column in a single command, and the sum of the difference
between the two columns in another.
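For the last step, a minimal sketch of the two commands, assuming the data has already been reduced to two whitespace-separated numeric columns on standard input. The helper names `col_min_max` and `sum_of_differences` are hypothetical, and no particular data set is implied.

```shell
# Both helpers are hypothetical and expect two whitespace-separated
# numeric columns on standard input.

# Print the minimum and maximum of the first column.
col_min_max() {
    awk 'NR == 1       { min = max = $1 + 0 }
         $1 + 0 < min  { min = $1 + 0 }
         $1 + 0 > max  { max = $1 + 0 }
         END           { if (NR) print min, max }'
}

# Print the sum of (column 1 - column 2) over all rows.
sum_of_differences() {
    awk '{ total += $1 - $2 } END { print total + 0 }'
}
```

Each helper is the final stage of a pipeline, so the output of `curl` followed by `jq` or `pup` (and any column selection) can be piped straight into it.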