Supporting realtime data feeds

Currently I spend a not insignificant amount of time supporting third-party public API data feeds that supply satellite imagery through a variety of formats and delivery mechanisms.

The variety of protocols and formats means a lot can go wrong, and interruptions are frequent. That is not a major issue when you only need forecasts for the day, but it becomes one when you want near-realtime or accurately modelled predictions every 5 minutes. Demanding such high-fidelity information at that clip causes hiccups, and when it does, these are the tools I reach for when debugging the bash scripts that call programs written in a variety of languages (Python, MATLAB, C#, etc.):

  • grep - The handy do-it-all for filtering when other utilities pipe their output into it, e.g. with the ps command:
    ps aux | grep "python"
  • ps/ls/ll - Process and file listings; used regularly, most often together with grep
  • lsof - Lists open files; most often used to identify which process has a file open. Important when a pid file appears to be locked
  • iotop - Shows where the I/O bottlenecks are on your system so you can clear or optimize them; also worth checking when something appears stuck
  • htop - Interactive and helpful replacement for the limited top command
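
As a rough illustration of how the first few tools combine, here is a minimal sketch of some triage helpers. The function names (bracket_pattern, find_procs, who_holds) are hypothetical and not from any library; only ps, grep, and lsof themselves are real. The sketch assumes plain alphanumeric process names.

```shell
#!/bin/sh
# Sketch of triage helpers for a stuck feed job.
# Function names are hypothetical; only ps, grep, and lsof are real tools.

# Wrap the first character of a name in brackets, e.g. python -> [p]ython,
# so that a "ps | grep" pipeline does not match its own grep process.
# Assumes a plain alphanumeric name (no regex metacharacters).
bracket_pattern() {
    [ -n "$1" ] || return 1
    first=${1%"${1#?}"}   # first character of the name
    rest=${1#?}           # everything after the first character
    printf '[%s]%s\n' "$first" "$rest"
}

# List running processes whose command line contains the given name.
find_procs() {
    ps aux | grep -- "$(bracket_pattern "$1")"
}

# Show which process (if any) has a file open, e.g. a pid or lock file.
who_holds() {
    if command -v lsof >/dev/null 2>&1; then
        lsof "$1"
    else
        echo "lsof is not installed" >&2
        return 127
    fi
}
```

With these sourced into a shell, find_procs python lists the running Python processes without the grep itself showing up, and who_holds followed by the path to a pid file (whatever path your job uses) shows the process holding it open.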