The primary challenge in leveraging connected vehicle data stems from two facts: failures are extremely rare events, and the data is high-dimensional.
As a result, inferring structure from this time series data is prone to overfitting – and finding true signals is akin to finding a needle in a haystack.
Viaduct solves this problem through our patent-pending Temporal Structural Inference (TSI) Engine. The engine is built to construct relationships between high-dimensional time series data, particularly in cases of rare downstream events like component quality issues.
Viaduct spun out of a Ph.D. research lab at Stanford, and the foundational research is some of the earliest work in academia in the field of scalable heterogeneous time series analytics.
Spotting trends, detecting anomalies, and making predictions on time series data require the ability to infer 1) structural relationships (i.e., correlations) between different entities; and 2) how these relationships evolve over time.
Each of those in itself is challenging, especially when the data is high-dimensional and heterogeneous (time series sensor readings, asynchronous DTCs, static build data, and unstructured data like dealer notes).
Identifying correlations across all these dimensions is a large-scale combinatorial optimization problem for which no efficient (polynomial-time) algorithm is known.
Viaduct’s TSI Engine models complex, high-dimensional data as measurements of latent vehicle factors. A factor is an inferred set of relationships between vehicle data sources, capturing both:
- The relationship between subsystems and components under normal operating conditions (e.g., a “healthy” correlation between a cluster of sensor readings) and
- How raw data sources correlate with component failures (e.g., the correlation between sensor readings and a certain DTC or warranty claim)
The technical implementation relaxes the problem to a convex approximation and estimates a sparse inverse covariance matrix, which reveals a dynamic network of interdependencies between entities. Regularizing the network helps avoid overfitting and keeps the resulting factors interpretable. Since dynamic network inference is computationally expensive, we leverage an alternating minimization approach to solve the problem efficiently.
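To make the idea concrete, here is a minimal sketch of sparse inverse covariance estimation for a single time window, solved with ADMM, which is one common alternating scheme. It is an illustration under stated assumptions, not Viaduct's implementation: the function name `sparse_precision_admm`, the penalty `lam`, the step parameter `rho`, and the toy covariance matrix are all hypothetical, and the dynamic (time-varying) part of the problem is left out.

```python
import numpy as np

def soft_threshold(a, kappa):
    """Shrink values toward zero by kappa (the proximal step for an L1 penalty)."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

def sparse_precision_admm(S, lam=0.1, rho=1.0, n_iter=200):
    """Sketch: minimize -logdet(Theta) + tr(S @ Theta) + lam * |Theta|_1 (off-diagonal)
    by alternating between a smooth step, a sparsifying step, and a dual update.
    S is an empirical covariance matrix; all parameter names are hypothetical."""
    p = S.shape[0]
    Z = np.eye(p)                 # sparse copy of the precision matrix
    U = np.zeros((p, p))          # scaled dual variable
    off_diag = ~np.eye(p, dtype=bool)
    for _ in range(n_iter):
        # Theta-step: closed form via an eigendecomposition.
        eigvals, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_eig = (eigvals + np.sqrt(eigvals**2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_eig) @ Q.T
        # Z-step: soft-threshold the off-diagonal entries only.
        A = Theta + U
        Z = A.copy()
        Z[off_diag] = soft_threshold(A[off_diag], lam / rho)
        # Dual update: accumulate the disagreement between the two copies.
        U += Theta - Z
    return Z

# Toy covariance for four signals forming two independent pairs (made-up numbers).
S = np.array([[1.0, 0.6, 0.0, 0.0],
              [0.6, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.5],
              [0.0, 0.0, 0.5, 1.0]])
network = sparse_precision_admm(S, lam=0.1)
# Nonzero off-diagonal entries are the inferred links between signals.
edges = np.argwhere(np.triu(network, k=1) != 0)
```

In this toy case the entries linking the two pairs should come back as exact zeros, while the within-pair entries stay nonzero. That sparsity pattern is what makes the resulting network readable as a set of relationships rather than a dense matrix.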
Correlation mapping plays a key role in Viaduct's ability to detect anomalies and predict failures.
Viaduct maintains a latent representation of vehicle health as a set of relationships between factors. Deviations from known relationship patterns signal potential anomalies.
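One simple way to picture "deviation from a known relationship pattern" is a Mahalanobis-style score against a baseline learned from healthy data. The sketch below is an assumption-laden illustration, not the TSI Engine itself: `fit_baseline`, `anomaly_score`, and the two synthetic signals are hypothetical, and a plain matrix inverse stands in for the sparse estimate described earlier.

```python
import numpy as np

def fit_baseline(healthy):
    """Learn the 'healthy' relationship between signals.
    healthy: array of shape (n_samples, n_signals) from normal operation."""
    mean = healthy.mean(axis=0)
    precision = np.linalg.inv(np.cov(healthy, rowvar=False))
    return mean, precision

def anomaly_score(x, mean, precision):
    """Squared Mahalanobis distance: large when x breaks the learned correlations."""
    diff = np.asarray(x, dtype=float) - mean
    return float(diff @ precision @ diff)

# Two synthetic signals that normally move together (made-up data).
rng = np.random.default_rng(0)
a = rng.normal(size=1000)
b = a + 0.3 * rng.normal(size=1000)
mean, precision = fit_baseline(np.column_stack([a, b]))

consistent = anomaly_score([1.0, 1.0], mean, precision)   # signals agree
diverging = anomaly_score([1.0, -1.0], mean, precision)   # signals disagree
```

Both readings have individually unremarkable values, but the diverging one scores much higher because it contradicts the learned correlation. A per-signal threshold would miss it.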
Inferring the relationship between vehicle factors doesn't just help identify emerging issues. It also enables isolation of the most affected populations – by identifying vehicle clusters with the highest incidence rates (e.g., vehicles with a certain engine component assembled in one plant between May and August 2022, driven primarily above 4,000 feet in elevation).
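As a rough illustration of isolating affected populations, the sketch below groups vehicles by a few attributes and ranks the cohorts by incidence rate. The record fields (`plant`, `high_altitude`, `failed`), the helper `rank_cohorts`, and the data are hypothetical. In practice the candidate attributes would come from the inferred factors rather than being chosen by hand.

```python
from collections import defaultdict

def rank_cohorts(vehicles, keys, min_size=2):
    """Group vehicles by the given attribute keys and rank cohorts by failure rate.
    Returns a list of (cohort, incidence_rate, cohort_size), highest rate first."""
    counts = defaultdict(lambda: [0, 0])  # cohort -> [failures, total]
    for v in vehicles:
        cohort = tuple(v[k] for k in keys)
        counts[cohort][0] += int(v["failed"])
        counts[cohort][1] += 1
    rates = [
        (cohort, fails / total, total)
        for cohort, (fails, total) in counts.items()
        if total >= min_size  # ignore cohorts too small to say anything about
    ]
    return sorted(rates, key=lambda r: r[1], reverse=True)

# Made-up records for illustration only.
vehicles = [
    {"plant": "A", "high_altitude": True, "failed": True},
    {"plant": "A", "high_altitude": True, "failed": True},
    {"plant": "A", "high_altitude": True, "failed": False},
    {"plant": "A", "high_altitude": False, "failed": False},
    {"plant": "A", "high_altitude": False, "failed": False},
    {"plant": "B", "high_altitude": True, "failed": False},
    {"plant": "B", "high_altitude": True, "failed": True},
    {"plant": "B", "high_altitude": False, "failed": False},
    {"plant": "B", "high_altitude": False, "failed": False},
    {"plant": "B", "high_altitude": False, "failed": False},
]
ranking = rank_cohorts(vehicles, keys=("plant", "high_altitude"))
top_cohort, top_rate, top_size = ranking[0]
```

The `min_size` guard matters when failures are rare: a cohort of one vehicle with one failure has a 100% incidence rate but tells you very little.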
Independently modeling failure modes – when failures are extremely rare and the data is high-dimensional – runs the risk of overfitting as additional factors are added.
Viaduct’s correlation engine automatically identifies related components and subsystems – and jointly regularizes models trained to predict when these components will fail. Enabling a failure model for a given component to “learn” from related components improves model performance without increasing the risk of overfitting.
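Joint regularization can be sketched with a small multi-task example: each component gets its own model, and a coupling penalty pulls the weights of related components toward each other. This is a simplified stand-in under stated assumptions, not Viaduct's method. It uses linear ridge regression rather than a failure classifier, and `fit_jointly`, `lam`, `gamma`, the `edges` list of related components, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_jointly(tasks, edges, lam=1.0, gamma=10.0):
    """Fit one linear model per task while coupling related tasks.
    Minimizes sum_t ||X_t w_t - y_t||^2 + lam * sum_t ||w_t||^2
              + gamma * sum_{(i, j) in edges} ||w_i - w_j||^2.
    tasks: list of (X, y); edges: pairs of task indices considered related."""
    k = len(tasks)
    d = tasks[0][0].shape[1]
    # Graph Laplacian of the relatedness graph encodes the coupling penalty.
    lap = np.zeros((k, k))
    for i, j in edges:
        lap[i, i] += 1.0
        lap[j, j] += 1.0
        lap[i, j] -= 1.0
        lap[j, i] -= 1.0
    A = gamma * np.kron(lap, np.eye(d))
    b = np.zeros(k * d)
    for t, (X, y) in enumerate(tasks):
        block = slice(t * d, (t + 1) * d)
        A[block, block] += X.T @ X + lam * np.eye(d)
        b[block] = X.T @ y
    # The objective is a convex quadratic, so one linear solve gives the optimum.
    return np.linalg.solve(A, b).reshape(k, d)

# Synthetic data: component A has many observations, component B has very few.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X_a = rng.normal(size=(200, 3))
y_a = X_a @ w_true + 0.1 * rng.normal(size=200)
X_b = rng.normal(size=(5, 3))
y_b = X_b @ (w_true + 0.1) + 0.1 * rng.normal(size=5)
tasks = [(X_a, y_a), (X_b, y_b)]

independent = fit_jointly(tasks, edges=[], lam=1.0, gamma=0.0)
joint = fit_jointly(tasks, edges=[(0, 1)], lam=1.0, gamma=50.0)
```

With `gamma=0` and no edges, the tasks are fitted independently. With a positive `gamma`, the data-poor model for component B borrows strength from component A without adding parameters, which is the intuition behind letting a failure model "learn" from related components.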
If you’re interested in learning more about Viaduct’s TSI Engine, feel free to check out some of our research – or reach out for a demo to see it in action.