Forensic Seismology and Acoustics

Notebook for class on February 17th.

Stephen Arrowsmith (sarrowsmith@smu.edu)

Data Exercise 1: Reading and analyzing DPRK data

This first class introduces you to analyzing seismic data with Python. We're going to dive in the deep end and try to tread water. First thing we'll do is import some 'modules'. These are different code packages that contain different things. We can import whole modules, parts of modules, and rename modules when we import them.

The module called 'forensic' is built for this class. The module called 'obspy' is an important one for analyzing seismic and acoustic data.

In [1]:
import forensic
from obspy import UTCDateTime
from pyproj import Geod
import pandas as pd
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import datetime, pickle

The next thing we'll do is define an event location and time. We're just defining the location and time of the DPRK nuclear test in September 2017, using the values from the Myers et al. (2018) paper that we read already. Defining times can be tricky, ObsPy (the main seismic data module we read in above) likes times in a type of time format called 'UTCDateTime'.

In [2]:
evla = 41.3; evlo = 129.08; evdp = 0.08
evt0 = UTCDateTime(datetime.datetime(2017,9,3,3,30,1))
In [3]:
evt0
Out[3]:
2017-09-03T03:30:01.000000Z

Reading data

You can read data seismic directly into Python with an internet connection using the forensic module like below. We'll skip over the details of how to do this for now, because it's complex, but it might come in handy later. For now, we'll open the file called 'DPRK.p' that you downloaded already.

In [4]:
#st = forensic.getStreamFromIRIS(evt0-600, evt0+3000, '*', 'IU,II,IC', 'BHZ')
#pickle.dump(st, open('DPRK.p', 'wb'))
In [5]:
st = pickle.load(open('DPRK.p', 'rb'))

Exploring ObsPy

To explore data in ObsPy, you need to understand how the data are structured in an 'object' called an ObsPy Stream: https://docs.obspy.org/packages/obspy.core.html#waveform-data

In [6]:
print(st)
235 Trace(s) in Stream:

IC.BJT.00.BHZ | 2017-09-03T03:20:01.019538Z - 2017-09-03T04:20:00.969538Z | 20.0 Hz, 72000 samples
...
(233 other traces)
...
IU.YSS.00.BHZ | 2017-09-03T03:20:01.049900Z - 2017-09-03T04:20:00.999900Z | 20.0 Hz, 72000 samples

[Use "print(Stream.__str__(extended=True))" to print all Traces]
In [7]:
tr = st[0]
print(tr.stats)
               network: IC
               station: BJT
              location: 00
               channel: BHZ
             starttime: 2017-09-03T03:20:01.019538Z
               endtime: 2017-09-03T04:20:00.969538Z
         sampling_rate: 20.0
                 delta: 0.05
                  npts: 72000
                 calib: 1.0
_fdsnws_dataselect_url: http://service.iris.edu/fdsnws/dataselect/1/query
               _format: MSEED
                 mseed: AttribDict({'dataquality': 'M', 'number_of_records': 249, 'encoding': 'STEIM2', 'byteorder': '>', 'record_length': 512, 'filesize': 43102720})
              response: Channel Response
	From m/s (Velocity in Meters Per Second) to counts (Digital Counts)
	Overall Sensitivity: 5.44458e+09 defined at 0.050 Hz
	3 stages:
		Stage 1: PolesZerosResponseStage from m/s to V, gain: 3245
		Stage 2: CoefficientsTypeResponseStage from V to counts, gain: 1.67772e+06
		Stage 3: CoefficientsTypeResponseStage from counts to counts, gain: 1
                   sac: AttribDict({'stla': 40.018300000000004, 'stlo': 116.1679, 'stel': 116.1679})
In [8]:
tr.data
Out[8]:
array([2102, 1996, 1911, ..., 2359, 2396, 2611], dtype=int32)
In [9]:
%pylab notebook
plt.plot(tr.data)
Populating the interactive namespace from numpy and matplotlib
Out[9]:
[<matplotlib.lines.Line2D at 0x7f93104d4290>]
In [10]:
%pylab notebook
#fig, ax = plt.subplots()
fig = tr.plot(handle=True)
Populating the interactive namespace from numpy and matplotlib
In [11]:
for tr in st:
    print(tr.stats.station, tr.stats.network,
          tr.stats.channel, tr.stats.sampling_rate,
          round(forensic.get_dist(tr, evla, evlo)/111.1,2))
BJT IC BHZ 20.0 9.88
ENH IC BHZ 20.0 19.28
ENH IC BHZ 40.0 19.28
HIA IC BHZ 20.0 10.32
KMI IC BHZ 20.0 27.18
LSA IC BHZ 20.0 32.72
MDJ IC BHZ 20.0 3.34
MDJ IC BHZ 40.0 3.34
QIZ IC BHZ 20.0 27.68
SSE IC BHZ 20.0 12.03
WMQ IC BHZ 20.0 30.29
XAN IC BHZ 20.0 17.49
AAK II BHZ 20.0 39.92
AAK II BHZ 40.0 39.92
ABPO II BHZ 20.0 96.65
ABPO II BHZ 40.0 96.65
ALE II BHZ 20.0 56.11
ALE II BHZ 40.0 56.11
ARU II BHZ 20.0 46.53
ASCN II BHZ 40.0 133.65
BFO II BHZ 20.0 76.32
BORG II BHZ 20.0 71.5
BORG II BHZ 40.0 71.5
BRVK II BHZ 20.0 40.45
BRVK II BHZ 40.0 40.45
CMLA II BHZ 20.0 97.69
CMLA II BHZ 40.0 97.69
COCO II BHZ 40.0 61.26
DGAR II BHZ 20.0 71.12
DGAR II BHZ 40.0 71.12
EFI II BHZ 20.0 168.68
EFI II BHZ 40.0 168.68
ERM II BHZ 20.0 10.54
ERM II BHZ 40.0 10.54
ESK II BHZ 20.0 75.28
ESK II BHZ 40.0 75.28
FFC II BHZ 20.0 74.63
FFC II BHZ 40.0 74.63
HOPE II BHZ 20.0 164.0
HOPE II BHZ 40.0 164.0
JTS II BHZ 20.0 119.75
JTS II BHZ 40.0 119.75
KAPI II BHZ 20.0 47.13
KAPI II BHZ 40.0 47.13
KDAK II BHZ 40.0 50.3
KDAK II BHZ 40.0 50.3
KIV II BHZ 20.0 60.57
KIV II BHZ 40.0 60.57
KURK II BHZ 20.0 35.57
KURK II BHZ 40.0 35.57
KWJN II BHZ 40.0 46.64
KWJN II BHZ 40.0 46.64
LVZ II BHZ 20.0 53.91
LVZ II BHZ 40.0 53.91
MBAR II BHZ 20.0 96.74
MBAR II BHZ 40.0 96.74
MSEY II BHZ 20.0 81.0
MSEY II BHZ 40.0 81.0
MSVF II BHZ 20.0 74.49
MSVF II BHZ 40.0 74.49
NNA II BHZ 20.0 143.06
NNA II BHZ 40.0 143.06
OBN II BHZ 20.0 58.55
OBN II BHZ 40.0 58.55
PALK II BHZ 40.0 54.7
PFO II BHZ 20.0 83.97
PFO II BHZ 40.0 83.97
RPN II BHZ 20.0 130.74
RPN II BHZ 40.0 130.74
SACV II BHZ 40.0 118.42
SHEL II BHZ 20.0 133.8
SHEL II BHZ 40.0 133.8
SIMI II BHZ 40.0 45.21
SUR II BHZ 20.0 123.63
SUR II BHZ 40.0 123.63
TAU II BHZ 20.0 85.87
TAU II BHZ 40.0 85.87
UOSS II BHZ 20.0 61.44
UOSS II BHZ 40.0 61.44
WRAB II BHZ 20.0 61.48
WRAB II BHZ 40.0 61.48
XPFO II BHZ 40.0 83.97
XPFO II BHZ 40.0 83.97
XPFO II BHZ 40.0 83.97
XPFO II BHZ 40.0 83.97
XPFO II BHZ 40.0 83.97
ADK IU BHZ 40.0 37.82
AFI IU BHZ 20.0 77.62
AFI IU BHZ 40.0 77.62
ANMO IU BHZ 20.0 88.38
ANMO IU BHZ 40.0 88.38
ANTO IU BHZ 40.0 68.96
ANTO IU BHZ 40.0 68.96
BBSR IU BHZ 40.0 105.33
BBSR IU BHZ 40.0 105.33
BILL IU BHZ 20.0 33.39
CASY IU BHZ 20.0 108.62
CASY IU BHZ 40.0 108.62
CCM IU BHZ 20.0 92.61
CCM IU BHZ 40.0 92.61
CCM IU BHZ 40.0 92.61
CHTO IU BHZ 40.0 34.15
CHTO IU BHZ 40.0 34.15
COLA IU BHZ 20.0 50.54
COLA IU BHZ 40.0 50.54
COR IU BHZ 20.0 72.52
COR IU BHZ 40.0 72.52
COR IU BHZ 40.0 72.52
CTAO IU BHZ 20.0 63.48
CTAO IU BHZ 40.0 63.48
DAV IU BHZ 40.0 34.4
DAV IU BHZ 40.0 34.4
DWPF IU BHZ 20.0 105.15
DWPF IU BHZ 40.0 105.15
FUNA IU BHZ 40.0 67.81
FUNA IU BHZ 40.0 67.81
FURI IU BHZ 20.0 84.51
FURI IU BHZ 40.0 84.51
GNI IU BHZ 20.0 61.22
GNI IU BHZ 40.0 61.22
GUMO IU BHZ 40.0 30.96
GUMO IU BHZ 40.0 30.96
HKT IU BHZ 20.0 97.63
HKT IU BHZ 40.0 97.63
HNR IU BHZ 40.0 58.19
HNR IU BHZ 40.0 58.19
HRV IU BHZ 20.0 94.23
HRV IU BHZ 40.0 94.23
HRV IU BHZ 40.0 94.23
INCN IU BHZ 20.0 4.27
INCN IU BHZ 40.0 4.27
JOHN IU BHZ 40.0 57.74
JOHN IU BHZ 40.0 57.74
KBS IU BHZ 20.0 54.47
KBS IU BHZ 40.0 54.47
KEV IU BHZ 20.0 55.66
KEV IU BHZ 40.0 55.66
KIP IU BHZ 20.0 63.53
KIP IU BHZ 40.0 63.53
KIP IU BHZ 40.0 63.53
KMBO IU BHZ 20.0 92.2
KMBO IU BHZ 40.0 92.2
KMBO IU BHZ 20.0 92.2
KONO IU BHZ 20.0 67.56
KONO IU BHZ 40.0 67.56
KOWA IU BHZ 40.0 109.47
LCO IU BHZ 20.0 159.91
LCO IU BHZ 40.0 159.91
LCO IU BHZ 40.0 159.91
LSZ IU BHZ 20.0 108.2
LVC IU BHZ 20.0 156.11
LVC IU BHZ 40.0 156.11
MA2 IU BHZ 20.0 22.71
MA2 IU BHZ 40.0 22.71
MAJO IU BHZ 20.0 8.54
MAJO IU BHZ 40.0 8.54
MAJO IU BHZ 40.0 8.54
MAKZ IU BHZ 40.0 33.81
MBWA IU BHZ 40.0 63.11
MBWA IU BHZ 40.0 63.11
MIDW IU BHZ 40.0 45.18
NWAO IU BHZ 20.0 75.09
NWAO IU BHZ 40.0 75.09
OTAV IU BHZ 20.0 131.68
OTAV IU BHZ 40.0 131.68
PAB IU BHZ 20.0 88.82
PAB IU BHZ 40.0 88.82
PAYG IU BHZ 20.0 126.16
PAYG IU BHZ 40.0 126.16
PET IU BHZ 20.0 23.05
PMG IU BHZ 20.0 53.41
PMG IU BHZ 40.0 53.41
PMG IU BHZ 40.0 53.41
PMSA IU BHZ 20.0 155.48
PMSA IU BHZ 40.0 155.48
POHA IU BHZ 20.0 66.39
POHA IU BHZ 40.0 66.39
PTCN IU BHZ 40.0 114.14
PTCN IU BHZ 40.0 114.14
QSPA IU BHZ 20.0 131.34
QSPA IU BHZ 40.0 131.34
QSPA IU BHZ 40.0 131.34
QSPA IU BHZ 40.0 131.34
QSPA IU BHZ 20.0 131.34
RAO IU BHZ 40.0 85.94
RAO IU BHZ 40.0 85.94
RAR IU BHZ 20.0 90.79
RAR IU BHZ 40.0 90.79
RCBR IU BHZ 20.0 142.2
RCBR IU BHZ 40.0 142.2
RSSD IU BHZ 20.0 82.27
RSSD IU BHZ 40.0 82.27
SAML IU BHZ 20.0 146.0
SAML IU BHZ 40.0 146.0
SBA IU BHZ 40.0 121.44
SBA IU BHZ 40.0 121.44
SDV IU BHZ 40.0 126.75
SFJD IU BHZ 20.0 71.76
SFJD IU BHZ 40.0 71.76
SJG IU BHZ 40.0 119.04
SJG IU BHZ 40.0 119.04
SLBS IU BHZ 20.0 95.18
SLBS IU BHZ 40.0 95.18
SNZO IU BHZ 20.0 92.43
SNZO IU BHZ 40.0 92.43
SSPA IU BHZ 20.0 94.57
SSPA IU BHZ 40.0 94.57
TARA IU BHZ 40.0 56.18
TARA IU BHZ 40.0 56.18
TATO IU BHZ 20.0 17.51
TATO IU BHZ 40.0 17.51
TEIG IU BHZ 40.0 109.49
TEIG IU BHZ 40.0 109.49
TEIG IU BHZ 40.0 109.49
TIXI IU BHZ 20.0 30.36
TRIS IU BHZ 40.0 150.16
TRQA IU BHZ 20.0 171.05
TRQA IU BHZ 40.0 171.05
TSUM IU BHZ 20.0 118.59
TSUM IU BHZ 40.0 118.59
TUC IU BHZ 20.0 88.13
TUC IU BHZ 40.0 88.13
TUC IU BHZ 40.0 88.13
ULN IU BHZ 20.0 16.95
ULN IU BHZ 40.0 16.95
WAKE IU BHZ 40.0 38.77
WAKE IU BHZ 40.0 38.77
WCI IU BHZ 20.0 94.25
WCI IU BHZ 40.0 94.25
WVT IU BHZ 20.0 95.59
WVT IU BHZ 40.0 95.59
XMAS IU BHZ 40.0 76.35
XMAS IU BHZ 40.0 76.35
YAK IU BHZ 20.0 20.75
YSS IU BHZ 20.0 11.32
In [12]:
%pylab notebook

ax = plt.axes(projection=ccrs.PlateCarree())
ax.coastlines()

for tr in st:
    ax.plot(tr.stats.sac.stlo, tr.stats.sac.stla, 'y^', transform=ccrs.Geodetic())
ax.plot(evlo, evla, 'r*', transform=ccrs.Geodetic())
ax.gridlines(draw_labels=True)
plt.show()
Populating the interactive namespace from numpy and matplotlib

Exploring the DPRK data

We'll plot 'record sections' which are plots that show multiple seismograms, arranged according to the location and origin time of an event (in this case, the September 2017 DPRK test).

In [13]:
%pylab notebook

forensic.plot_recordsection_global(st, evlo, evla, 
                                   date2num(evt0.datetime),
                                   scaleFactor=1.5)
plt.xlim([0., 160.])
plt.ylim([0., 50.])
Populating the interactive namespace from numpy and matplotlib
Out[13]:
(0.0, 50.0)
In [14]:
%pylab notebook

st_f = st.copy()
st_f.taper(type='cosine', max_percentage=0.001)
st_f.filter('highpass', freq=1)

forensic.plot_recordsection_global(st_f, evlo, evla, 
                                   date2num(evt0.datetime),
                                   scaleFactor=1.5)
plt.xlim([0., 160.])
plt.ylim([0., 50.])
Populating the interactive namespace from numpy and matplotlib
Out[14]:
(0.0, 50.0)
In [15]:
%pylab notebook

st_f = st.copy()
st_f.taper(type='cosine', max_percentage=0.001)
st_f.filter('lowpass', freq=0.1)

forensic.plot_recordsection_global(st_f, evlo, evla, 
                                   date2num(evt0.datetime),
                                   scaleFactor=1.5)
plt.xlim([0., 160.])
plt.ylim([0., 50.])
Populating the interactive namespace from numpy and matplotlib
Out[15]:
(0.0, 50.0)

Estimating mb

Following the new IASPEI standards for magnitude estimation (Borman and Dewey, 2012), if $A$ is calculated in nm (Peak-to-Peak) and $T$ is calculated in s, the body wave magnitude can be calculated from:

$ mb = \log_{10} (A/T) + Q(\Delta, h) - 3.0$

Where:

$A$ = P-wave ground amplitude in nm calculated from the maximum trace-amplitude in the entire P-phase train (time spanned by P, pP, sP, and possibly PcP and their codas, and ending preferably before PP);

$T$ = period in seconds, T < 3 s; of the maximum P-wave trace amplitude.

$Q(\Delta, h)$ = attenuation function for PZ (P-waves recorded on vertical component seismographs) established by Gutenberg and Richter (1956a) in the tabulated or algorithmic form as used by the U.S. Geological Survey/National Earthquake Information Center (USGS/NEIC)

$\Delta$ = epicentral distance in degrees, 20° ≤ $\Delta$ ≤ 100°;

$h$ = focal depth in km;

and where both $T$ and the maximum trace amplitude are measured on output from a vertical-component instrument that is filtered so that the frequency response of the seismograph/filter system replicates that of a WWSSN short-period seismograph, with $A$ being determined by dividing the maximum trace amplitude by the magnification of the simulated WWSSN-SP response at period $T$.

We'll begin by plotting the attenuation function, $Q(\Delta, h)$:

In [16]:
%pylab notebook

delta, depth, mb_q = forensic.q_for_mb()

plt.pcolormesh(delta, depth, mb_q.transpose(), cmap=plt.get_cmap('bone_r'))
plt.xlabel('Distance $(^{\circ})$')
plt.ylabel('Depth (km)')
plt.colorbar()
plt.show()
Populating the interactive namespace from numpy and matplotlib

Analyzing data from MAKZ

We'll start by analyzing data from MAKZ, a seismic station in Kazakstan that is part of the Global Seismographic Network (GSN) network. MAKZ is in the set of distance ranges in which it is suitable to measure a mb magnitude (33.8 degrees).

In [17]:
forensic.get_dist?
In [18]:
tr = st.select(station='MAKZ')[0].copy()
In [19]:
forensic.get_dist(tr, evla, evlo)/111.1
Out[19]:
33.80767064940964

What value of $Q(\Delta, h)$ should we use in estimating an mb for MAKZ? We know the event is near the surface, so we can index into our matrix of mb_q values for the closest tabulated distance and a source at 0 km depth.

In [20]:
print(delta.shape)
print(mb_q.shape)
(81,)
(81, 17)
In [21]:
ix = np.argmin(np.abs(delta - forensic.get_dist(tr, evla, evlo)/111.1))
mb_q[ix,0]
Out[21]:
6.7000000000000002

Plotting the raw data

The amplitudes are in 'counts', which have no physical meaning but are simply related to the data logging system.

In [22]:
%pylab notebook

t = np.arange(0, tr.stats.delta*tr.stats.npts, tr.stats.delta)
plt.plot(t, tr.data, '-')
plt.xlabel('Time (s) after ' + str(tr.stats.starttime).split('.')[0])
plt.ylabel('Counts')
Populating the interactive namespace from numpy and matplotlib
Out[22]:
Text(0, 0.5, 'Counts')

The metadata on the trace contains the response information, however. It tells us all the information we need to convert from counts to units of meters/second (because the sensor is measuring ground velocity).

In [23]:
print(tr.stats)
               network: IU
               station: MAKZ
              location: 00
               channel: BHZ
             starttime: 2017-09-03T03:20:01.019538Z
               endtime: 2017-09-03T04:20:00.994538Z
         sampling_rate: 40.0
                 delta: 0.025
                  npts: 144000
                 calib: 1.0
_fdsnws_dataselect_url: http://service.iris.edu/fdsnws/dataselect/1/query
               _format: MSEED
              distance: 3756032.209149411
                 mseed: AttribDict({'dataquality': 'M', 'number_of_records': 331, 'encoding': 'STEIM2', 'byteorder': '>', 'record_length': 512, 'filesize': 43102720})
              response: Channel Response
	From m/s (Velocity in Meters Per Second) to counts (Digital Counts)
	Overall Sensitivity: 2.4478e+09 defined at 0.020 Hz
	3 stages:
		Stage 1: PolesZerosResponseStage from m/s to V, gain: 1459
		Stage 2: CoefficientsTypeResponseStage from V to counts, gain: 1.67772e+06
		Stage 3: CoefficientsTypeResponseStage from counts to counts, gain: 1
                   sac: AttribDict({'stla': 46.808, 'stlo': 81.977000000000004, 'stel': 81.977000000000004})

We're going to have to do some manipulation to the data in order to get it into units of displacement, and then simulate the response of the WWSSN SP seismometer, which is what the mb magnitude scale is calibrated for. Fortunately, ObsPy (using mathematical methods called deconvolution and convolution, which are covered in Digital Signal Processing) comes to the rescue!

We first remove the response of the instrument, and convert to displacement (this performs a numerical integration of the time series). We then filter the data using a highpass and lowpass filter in order to simulate the response of the WWSSN SP seismometer (what it would measure). The filtered seismogram has lost a lot of the low-frequency energy and accentuates the P waves.

We then convert the displacements from meters to nanometers and plot the seismogram. Now, we can measure the amplitude and period and apply the mb formula above.

In [24]:
%pylab notebook

# Converting the seismogram to physical displacement:
tr_disp = tr.copy()
tr_disp.remove_response(output='DISP')

# Simulating the response of a WWSSN short-period instrument with a high and low pass filter:
tr_disp_filt = tr_disp.copy()
tr_disp_filt.taper(type='cosine', max_percentage=0.001)
tr_disp_filt.filter('highpass', freq=1, corners=3)
tr_disp_filt.filter('lowpass', freq=2.8, corners=2)

# Plotting the data (in nanometers):
plt.plot(t, 1e9*tr_disp_filt.data, '-')
plt.ylabel('Displacement (nanometers)')
plt.xlabel('Time (s) after ' + str(tr.stats.starttime).split('.')[0])
Populating the interactive namespace from numpy and matplotlib
Out[24]:
Text(0.5, 0, 'Time (s) after 2017-09-03T03:20:01')

Exercise

Use the recording above to calculate a mb for this station. Next, look at seismograms from a few other stations at suitable distance ranges and calculate magnitudes to estimate a mean and standard deviation of the magnitudes.

In [ ]:
 

Analyzing data from MDJ

The New China Digital Seismograph network station MDJ is one of the closest open stations to the DPRK test site, and has been used extensively in analysis of those events.

In [25]:
tr = st.select(station='MDJ')[1].copy()
In [26]:
g = Geod(ellps='sphere')
_,_,dist = g.inv(evlo,evla,tr.stats.sac.stlo,tr.stats.sac.stla); dist = dist/1000
print(dist/111.1)
3.3416689153159598
In [27]:
%pylab notebook

t = forensic.get_time_for_trace(tr)
plt.plot_date(t, tr.data, '-')
plt.xlim(date2num(evt0.datetime)+30/86400, date2num(evt0.datetime)+200/86400)
Populating the interactive namespace from numpy and matplotlib
Out[27]:
(736575.1461921296, 736575.1481597222)

Exercise

Reproduce Figure 1 in Myers et al. (2018).

Hint: You can request data from only MDJ as follows:

st = forensic.getStreamFromIRIS(evt0-600, evt0+3000, 'MDJ', 'IC', 'BHZ')

In [ ]: