At Microsoft, I learned a great deal about security.
One of my top five projects was a feature called “Always Encrypted” that we built in Azure SQL Database. In 2016, our customers didn’t trust the cloud with their data.
Customers wanted to ensure that their sensitive data, like credit card numbers or Social Security numbers, was always encrypted at rest and in transit. That is why we built the Always Encrypted feature: to meet this unmet demand.
Essentially, there were three big goals.
- Ensure that data stays encrypted at rest and in transit.
- Protect data from man-in-the-middle attacks.
- Prevent data access by unauthorized users, even highly privileged ones.
What Always Encrypted does is allow clients to encrypt sensitive data inside their applications and never reveal the encryption keys to the database engine. Everything happens client side. From a client perspective, it provides a clear separation between those who own the data and can view it, and those who manage the data but should not have access to it.
For example, database administrators and cloud operators manage the data but should not be able to read it. Essentially, Always Encrypted made sure that your data is protected even if somebody steals your backup.
We built new capabilities in the driver to do three things.
- Automatically encrypt data in sensitive columns on the client side before sending it to the database engine.
- Automatically decrypt the data in sensitive columns when it is returned in query results. So when you do a SELECT and get columns back, they are decrypted on the fly.
- Automatically rewrite queries on the client side so that the semantics of the query are preserved for the application.
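To make the first two driver capabilities concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `ToyServer`, `ToyDriver`, `encrypt_value`, and `decrypt_value` are hypothetical names, and the cipher is a toy stand-in built from HMAC-SHA256 in the standard library, not the AES-based encryption a real driver uses. It does not attempt query rewriting. The only point is the shape of the idea: the key lives with the client, and the server only ever stores ciphertext.

```python
import hashlib
import hmac
import os


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate keys for encryption and integrity from one column key.
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: HMAC-SHA256 over nonce + counter. Illustration only.
    blocks = [
        hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        for counter in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]


def encrypt_value(key: bytes, plaintext: str) -> bytes:
    data = plaintext.encode("utf-8")
    nonce = os.urandom(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(data))
    ciphertext = bytes(a ^ b for a, b in zip(data, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt_value(key: bytes, blob: bytes) -> str:
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext failed integrity check")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream)).decode("utf-8")


class ToyServer:
    """Stands in for the database engine: stores whatever it is handed."""

    def __init__(self):
        self.rows = []


class ToyDriver:
    """Stands in for the client driver: holds the key, the server never does."""

    def __init__(self, server, key, sensitive_columns):
        self.server = server
        self.key = key
        self.sensitive = set(sensitive_columns)

    def insert(self, row):
        # Encrypt sensitive columns before anything leaves the client.
        self.server.rows.append({
            col: encrypt_value(self.key, val) if col in self.sensitive else val
            for col, val in row.items()
        })

    def select_all(self):
        # Decrypt sensitive columns on the fly as results come back.
        return [
            {
                col: decrypt_value(self.key, val) if col in self.sensitive else val
                for col, val in row.items()
            }
            for row in self.server.rows
        ]


server = ToyServer()
driver = ToyDriver(server, key=os.urandom(32), sensitive_columns={"ssn"})
driver.insert({"name": "Venky", "ssn": "123-45-6789"})
print(server.rows[0]["ssn"].hex())  # what the server (or a stolen backup) holds
print(driver.select_all())          # what the application sees
```

In this sketch the application code only calls `insert` and `select_all`; it never touches the cipher directly, which mirrors the transparency the driver was meant to give application writers.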
Once configured, Always Encrypted was completely transparent to application writers. And if there was ever a man-in-the-middle attack, the data on the wire stayed encrypted. The results were significant because this changed customer perception about putting data in the cloud.
There’s one more project called Dynamic Data Masking.
The premise is that data is stored in the database, and then various forms and user interfaces are built to view that data.
So essentially, when you walk up to the user interface and log in as Sanjay, you want the Social Security number shown in the form, let’s say, to be replaced with ****. Whereas if you log in as Venky, you want to see the actual value rather than ****.
To enable this, we built a capability called Dynamic Data Masking in Azure SQL Database. You could walk up to the database, pick a column, and mask the data in that column.
We had standard patterns for Social Security numbers, credit cards, and a bunch of national IDs. But you could also specify your own pattern, which the server would then enforce.
By the time the data reaches the client, it is already masked, and again, this is based on RBAC (role-based access control). So, depending on who logs in and whether they have permission to see the data, you either see the real values or the masked ones.
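The masking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `partial_mask`, `COLUMN_MASKS`, `UNMASK_GRANTED`, and `read_row` are hypothetical names, and the rule format is loosely modeled on a "keep a prefix, pad the middle, keep a suffix" pattern rather than taken from the actual server implementation. The key property it shows is that the decision is made on the server side, per user, before any data is returned.

```python
def partial_mask(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Keep the first `prefix` and last `suffix` characters, pad the middle."""
    if len(value) <= prefix + suffix:
        # Too short to reveal anything safely: return only the padding.
        return padding
    tail = value[-suffix:] if suffix else ""
    return value[:prefix] + padding + tail


# Hypothetical per-column masking rules, defined once on the "server".
COLUMN_MASKS = {
    "ssn": lambda v: partial_mask(v, 0, "***-**-", 4),
    "card": lambda v: partial_mask(v, 0, "****-****-****-", 4),
}

# Hypothetical set of users who have been granted the right to see real values.
UNMASK_GRANTED = {"venky"}


def read_row(user: str, row: dict) -> dict:
    """Return the row as this user is allowed to see it."""
    if user in UNMASK_GRANTED:
        return dict(row)
    return {
        col: COLUMN_MASKS[col](val) if col in COLUMN_MASKS else val
        for col, val in row.items()
    }


row = {"name": "Customer A", "ssn": "123-45-6789", "card": "4111-1111-1111-1111"}
print(read_row("sanjay", row))  # masked view
print(read_row("venky", row))   # full view
```

Because the masked view is built before the row leaves `read_row`, a form or report written against this data needs no masking logic of its own.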
A typical scenario for this was, let’s say, Venky entering some data while Sanjay peeks over Venky’s shoulder to see credit card numbers. But guess what? That’s not possible anymore.