Application of a fully convolutional neural network trained on generalized seismic samples to monitoring the 2016 Kumamoto MW6.2 earthquake

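  • Abstract: Taking the 2016 Kumamoto, Japan, MW6.2 earthquake sequence as a test case, this study evaluates the early-warning timeliness, location accuracy, and cross-magnitude generalization of a recently proposed general fully convolutional neural network model. The monitoring area was set to the region most affected by the source, and waveform data from 12 stations were fed to the model, which located 69 of the 179 aftershocks in the first hour after the mainshock. The results show that, without transfer learning, the model can issue an alert 4.4 to 6.4 s after the P wave reaches the first station and provide reliable source parameters (epicentral root mean square error of 3.409 km and depth root mean square error of 3.787 km), demonstrating strong generalization and practicality and offering a potential solution for rapid deployment of earthquake early warning systems.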

    Abstract: Earthquake monitoring is a core task of seismological research. In recent years, neural-network-based methods for earthquake detection and location have advanced significantly. Waveform-based methods have become a research focus because of their strengths in feature extraction and real-time processing, and methods that use travel-time information also show considerable potential. However, most existing deep learning models share an unresolved limitation: poor generalization, with performance usually confined to the geographic area and network configuration used in training. This limitation hinders their use in global or regional earthquake early warning systems, which require rapid deployment and operational flexibility. To address this challenge, this study implements and evaluates a recently proposed general fully convolutional neural network (FCN) model. The model was trained on large-scale, diverse generalized seismic samples, and the complex 2016 MW6.2 Kumamoto earthquake sequence was used as the test case to quantify its early-warning timeliness, location accuracy, and cross-magnitude generalization under realistic operating conditions. The FCN architecture is designed for real-time seismic data processing and consists of three parallel sub-networks that perform event detection, source location, and magnitude estimation simultaneously through end-to-end analysis of continuous three-component waveform streams.
The model's generalization ability stems mainly from its training paradigm, which uses data reorganization to construct a large synthetic data set covering a wide range of source-station configurations. The data generation process simulates varied station geometries and source locations while obeying the physics of seismic wave propagation, so the model learns transferable physical characteristics rather than memorizing a specific network configuration. For verification, we selected the area most affected by the mainshock as the monitoring area, fed preprocessed waveform data from 12 stations into the pre-trained model, and identified and located 69 of the 179 aftershocks that occurred within the first hour after the mainshock. The results show that, without any transfer learning or region-specific parameter adjustment, the model can issue reliable alerts 4.4 to 6.4 s after the P wave arrives at the first triggered station and provide robust estimates of the basic source parameters. Over all successfully located events, the root mean square error is 3.409 km for the epicenter and 3.787 km for depth. Performance remains stable across the magnitude range of events in this complex sequence.
However, the assessment also reveals limitations that require further research. The system's aftershock detection rate is 38.5% (69 of 179 events); missed events are mainly small-magnitude events or events in spatio-temporal clusters whose waveforms interfere with one another. Depth estimates are also less accurate than horizontal locations, which may be related to the simplified one-dimensional velocity model used in training. In summary, the generalized FCN method offers a promising and feasible path to rapid deployment of earthquake early warning systems worldwide, balancing computation speed, accuracy, and generalization across tectonic settings. Future work should improve detection sensitivity for small and clustered earthquakes, introduce a more realistic three-dimensional velocity structure to improve depth resolution, and optimize the network architecture for resource-constrained environments.
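
The evaluation metrics quoted in the abstract (detection rate, epicentral RMSE, depth RMSE) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function names, the haversine definition of epicentral error, and the sample coordinates are assumptions made for illustration; only the counts 69 and 179 come from the abstract.

```python
import math


def detection_rate(n_located, n_catalog):
    """Fraction of catalog events that were located."""
    if n_catalog <= 0:
        raise ValueError("n_catalog must be positive")
    return n_located / n_catalog


def epicentral_distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle (haversine) distance between two epicenters, in km.

    Assumption: the abstract does not say how epicentral error is measured.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def rmse(errors):
    """Root mean square of a sequence of errors."""
    errors = list(errors)
    if not errors:
        raise ValueError("errors must be non-empty")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Counts reported in the abstract: 69 located out of 179 aftershocks.
print(f"detection rate: {detection_rate(69, 179):.1%}")  # 38.5%

# Hypothetical (lat, lon, depth_km) pairs, NOT data from the study.
catalog = [(32.75, 130.80, 11.0), (32.78, 130.84, 9.5)]
predicted = [(32.77, 130.79, 13.0), (32.76, 130.86, 8.0)]

epi_err = [epicentral_distance_km(c[0], c[1], p[0], p[1])
           for c, p in zip(catalog, predicted)]
dep_err = [p[2] - c[2] for c, p in zip(catalog, predicted)]

print(f"epicentral RMSE: {rmse(epi_err):.3f} km")
print(f"depth RMSE: {rmse(dep_err):.3f} km")
```

Both RMSE values are computed only over events the model actually located, so they say nothing about the missed events counted in the detection rate.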

     
