Global PowerPlant Database - Extraction
Wanted to get the latest list of all the electrical powerplants in the USA and thanks to the World Resources Institutes Global PowerPlant Database source. They do a great service in collecting and collating the various datasets available into a single updated source and all their methods and code is open sourced on GitHub
First get some helper tools
sudo apt-get install cargo csvkit busybox lynx;
cargo install xsv;- Line 1:
sudo apt-get install cargo csvkit busybox;Gets the Rust package manager cargo, a nice csv manipulator tool csvkit and the packaged utility set busybox and finally get the excellent original text browser lynx to extract and manipulate html - Line 2: Install the xsv tool from the Rust cargo package registry
Add this line to your .bashrc file if you don't have it already for Rust cargo binaries downloaded
export PATH="$HOME/.cargo/bin:$PATH"Now I can download, extract, and convert
wget -qO- $(lynx -listonly -nonumbers -dump http://datasets.wri.org/dataset | grep "globalpowerplantdatabasev") | busybox unzip -;
rm -v *.txt *.pdf;
xsv search -s country 'USA' global_power_plant_database.csv | xsv select name,capacity_mw,latitude,longitude,primary_fuel | xsv slice -o USA.csv
csvsql --db sqlite:///USA.sqlite --tables powerplants --insert USA.csv;
rm *.csv;- Line 1:
wget -qO- $(lynx -listonly -nonumbers -dump http://datasets.wri.org/dataset | grep "globalpowerplantdatabasev") | busybox unzip -;Find the dataset link with lynx, then get the zip file, and finally extract it Using the wget -q quiet and O- Output file to stream options here instead of file - Line 2:
rm -v *.txt *.pdf;Delete the unnecessary files - Line 3:
xsv search -s country 'USA' global_power_plant_database.csv | xsv select name,capacity_mw,latitude,longitude,primary_fuel | xsv slice -o USA.csvExtract the columns from the globalpowerplantsdabase name,capacity_mw,latitude,longitude,primary_fuel and save as
csvsql --db sqlite:///USA.sqlite --tables powerplants --insert USA.csv;USA.csv - Line 4:
csvsql --db sqlite:///USA.sqlite --tables powerplants --insert USA.csv;Convert theUSA.csvfile into a sqlite database calledUSA.sqlite - Line 5:
rm *.csv;Cleanup the*.csvfiles
Complete script
sudo apt-get install cargo csvkit busybox;
cargo install xsv;
PATH=$PATH:~/.cargo/bin;
wget -qO- $(lynx -listonly -nonumbers -dump http://datasets.wri.org/dataset | grep "globalpowerplantdatabasev") | busybox unzip -;
rm -v *.txt *.pdf;
xsv search -s country 'USA' global_power_plant_database.csv | xsv select name,capacity_mw,latitude,longitude,primary_fuel | xsv slice -o USA.csv
csvsql --db sqlite:///USA.sqlite --tables powerplants --insert USA.csv;
rm *.csv;