Deep Statistical Model Checking is a new technique to get insights in the quality of a policy induced by a neural network trained on a specific task which is used as a scheduler later to determinize the MDP model of a similar task.
In the original work we considered the simple Racetrack use case in which the neural network has been trained on the task of reaching the goal with a probability as high as possible while crashing as rarely as possible. For the resulting stochastic process, we harvest state-of-the-art statistical model checking techniques of The Modest Toolset to study the detailed behavior of the net. More concretely, we treat the NN as a black box to resolve the nondeterminism in the given MDP of the model. The NN gets a description of the current state and returns the action to apply next. The statistical analysis of the resulting Markov chain and thereby of the NN properties gives insights in the quality of the NN, not only for the whole task but also for specific regions of the Racetrack.
An impression of this quality analysis is given in the figure below, where a simple heat map visualizes the chance to safely reach the goal if starting in each of the cells along the track. On the left, the goal probabilities are smaller, which indicated that the net could be retrained to achieve better results.
DSMC enables the inspection of risky driving behavior induced by the NN. Such information can be used to retrain the net in a certain area of the map to improve quality or to see if the net prefers a specific route over an equivalent one. In a nutshell, DSMC provides a scalable verification method of NNs.
The whole artifact of the DSMC work is available online. In this artifact you can find a document with additional explanations about the modelling details, the executable of The Modest Toolset with the Oracle option to resolve nondeterminism in modes used in the paper, the scripts used for learning as well as the weights and biases of the final NNs and the logfiles for the experiments presented in the paper.
The video of the FORTE 2020 conference talk, where we presented DSMC:
It is also available here.