Using A Validation Server (Synthetic Data, a tutorial series)
Learn the functionalities, the benefits, and the processes behind a validation server in the context of data synthesis
This webinar extends the attendees’ knowledge of data synthesis with deep dives into the use of data simulators and the use of validation servers.
Data simulators are stored generative models that can be used to generate new data. Instead of making data available to users in raw form, models can be provided to data consumers. These models are what we call “simulators”. Users can then run the simulators to generate new datasets that have the same properties as the original datasets. The first part of this webinar will illustrate the use of simulators.
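To make the idea of a stored generative model concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not the method used by any particular synthesis platform: the "model" is just one independent Gaussian per column, the function names fit_simulator and generate are hypothetical, and the example records are made up for illustration. A real simulator would also need to capture relationships between variables.

```python
import json
import random
import statistics


def fit_simulator(rows):
    """Fit a toy generative model: one independent Gaussian per column.

    Assumption: every row is a dict with the same numeric columns.
    """
    columns = list(rows[0].keys())
    return {
        col: {
            "mean": statistics.mean(r[col] for r in rows),
            "stdev": statistics.stdev(r[col] for r in rows),
        }
        for col in columns
    }


def generate(simulator, n, seed=None):
    """Draw n synthetic rows from the stored model parameters."""
    rng = random.Random(seed)
    return [
        {col: rng.gauss(p["mean"], p["stdev"]) for col, p in simulator.items()}
        for _ in range(n)
    ]


# Illustrative (made-up) original records held by the data custodian.
original = [
    {"age": 34.0, "sbp": 118.0},
    {"age": 51.0, "sbp": 131.0},
    {"age": 47.0, "sbp": 125.0},
    {"age": 62.0, "sbp": 140.0},
]

# The custodian shares only the serialized model, not the raw rows.
simulator_json = json.dumps(fit_simulator(original))

# The data consumer loads the model and generates a new dataset.
synthetic = generate(json.loads(simulator_json), n=5, seed=0)
```

The point of the design is that only simulator_json leaves the custodian's environment; the consumer can generate as many synthetic rows as needed without ever seeing the original records.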
A validation server (also sometimes referred to as a verification server) is used to execute analytics code within a secure environment. Data analysts can get synthetic datasets and develop their modeling code on that synthetic data. Because the synthetic data is similar to the real data in structure, the code can then be sent to the validation server for execution. That way, the analyst can validate their results on the real data without getting access to the real data.
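The workflow described above can be sketched as follows. This is an illustrative toy in Python, not a real validation server: the ValidationServer class, its run method, and the example data are all hypothetical, and calling a function in the same process provides no actual security. A production system would run submitted code in an isolated environment and apply disclosure checks to the output.

```python
import statistics


def analysis(rows):
    """Analyst's code, developed and debugged on synthetic data.

    It returns only summary statistics, never row-level records.
    """
    return {
        "n": float(len(rows)),
        "mean_age": statistics.mean(r["age"] for r in rows),
    }


class ValidationServer:
    """Toy stand-in for a secure environment that holds the real data."""

    def __init__(self, real_rows):
        self._real_rows = real_rows  # never returned to the analyst

    def run(self, analysis_fn):
        result = analysis_fn(self._real_rows)
        # Toy output check (an assumption): allow only a flat dict of numbers.
        if not isinstance(result, dict) or not all(
            isinstance(v, (int, float)) for v in result.values()
        ):
            raise ValueError("only aggregate numeric results may be released")
        return result


# Made-up data: the analyst sees only the synthetic rows.
synthetic_rows = [{"age": 33.0}, {"age": 41.0}, {"age": 52.0}, {"age": 58.0}]
real_rows = [{"age": 30.0}, {"age": 40.0}, {"age": 50.0}, {"age": 60.0}]

# Step 1: develop the code on synthetic data.
synthetic_result = analysis(synthetic_rows)

# Step 2: submit the same code for execution on the real data.
server = ValidationServer(real_rows)
real_result = server.run(analysis)
```

Because the synthetic and real datasets share the same structure, the same analysis function runs unchanged in both places, and the analyst can compare synthetic_result with real_result without ever handling real_rows.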
The focus of this webinar will be to demonstrate the concepts behind simulators and validation servers and illustrate how they can be applied in practice.
PRESENTERS: This session will be presented by Lucy Mosquera, lead statistician at Replica Analytics.
TARGET AUDIENCE: This is intended to be an interactive, hands-on session in which participants will have the opportunity to practice the concepts on our training data synthesis platform. The target audience for this tutorial is data scientists, data analysts, and statistical programmers. Some knowledge of R is highly desirable, as some of the exercises will require the use of R.
BACKGROUND READING: As background reading for this tutorial series, the book Practical Synthetic Data Generation from O’Reilly is a recommended introduction to data synthesis and the concepts behind the different methodologies that are typically used.