Variables¶
Learning objectives
- use a variable
For teachers
Teaching goals are:
- Learners have practiced using a variable
Lesson plan:
- 5 mins: prior knowledge
- 5 mins: presentation
- 15 mins: challenge
- 5 mins: feedback
Overview¶
Sometimes we need AWK to remember things, for example, when summing values
Exercises¶
See the exercise procedure here.
Exercise 1: confirming things are true¶
Learning objectives
- experience variables
Download the data¶
In a terminal, do:
to download a file called coins.txt
.
Sum¶
In a terminal, in the same folder as where the data is downloaded, do:
In English, this is: 'For every line, add the value of the second column
to the variable called sum
. In the end, print the value of sum.
Confirm that this is true.
Get the maximum¶
In a terminal, in the same folder as where the data is downloaded, do:
In English, this is: 'For every line, if the value in the second column
is bigger than the value of max
, set max
to the value in the second column.
In the end, print the value of max
.
Confirm that this is true.
What is the assumption about the values in the second column?
The assumption about the values in the second column is that zero
is the lowest possible value, as the initial value of max
is zero.
If, for example, all the values in the second column are negative,
max
will be zero.
Exercise 2: explore data¶
Download the data¶
In a terminal, do:
wget https://raw.githubusercontent.com/richelbilderbeek/awk_course/master/data/diamonds_no_header.tsv
to download a file called diamonds_no_header.tsv
.
This file is a tab-separated file about diamonds and
is part of the ggplot2
R package.
The dataset has the following columns:
index | name | description |
---|---|---|
1 | carat | weight of the diamond (0.2--5.01) |
2 | cut | quality of the cut (Fair, Good, Very Good, Premium, Ideal) |
3 | color | diamond colour, from D (best) to J (worst) |
4 | clarity | a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best)) |
5 | depth | total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43--79) |
6 | table | width of top of diamond relative to widest point (43--95) |
7 | price | price in US dollars (\(326--\)18,823) |
8 | x | length in mm (0--10.74) |
9 | y | width in mm (0--58.9) |
10 | z | depth in mm (0--31.8) |
Explore the data¶
Using awk
only:
- show the number of diamonds in the dataset
- show the number of columns in the dataset
- show the lowest price of all the diamonds
- show the average price of all the diamonds
- show the average price of all the diamonds with an ideal cut
- show the total weight of all diamonds combined in kilos, 1 carat is 0.0002 kilo
- show the prices of all the diamonds together, in Swedish kroner. Assume 1 US dollar is 10.47 Swedish kroner